NOVEL Eph-RELATED TYROSINE KINASES, NUCLEOTIDE
SEQUENCES AND METHODS OF USE



This invention was funded in part by NIH Grants
HD 26351 and CA 56721. Accordingly, the United States
government has certain rights in the invention.






BACKGROUND OF THE INVENTION



This invention relates generally to protein
tyrosine kinases and, more particularly, to Eph-related
receptor tyrosine kinases and their manipulation for the
control of cellular processes.

Receptor tyrosine kinases comprise a large family
of proteins that share a number of structural features such
as a glycosylated extracellular ligand-binding domain, a
hydrophobic transmembrane domain and a conserved
cytoplasmic catalytic domain. Integral membrane tyrosine
kinases have been shown to mediate cellular signals
important for growth and differentiation. The transduction
of many extracellular signals to the cytoplasm occurs as a
result of the binding of ligands such as growth factors,
for example, to receptor tyrosine kinases at the cell
surface. In most cases, ligand binding activates the
cytoplasmic tyrosine kinase catalytic domain and culminates
in tyrosine phosphorylation of multiple substrates in the
cytoplasm.

Increased expression of membrane-spanning
receptor tyrosine kinases frequently has been associated
with alterations in normal cellular processes. The
affected cellular processes include cell proliferation,
differentiation and cancer, including, for example, human
cancers. Specific examples of such cancers can include
glioblastomas, squamous carcinomas and mammary carcinomas,
which are associated with the amplification of the EGF
receptor gene. Adenocarcinomas, breast cancers and gastric
cancers similarly are associated with aberrant expression
of the HER2/neu receptor and certain breast carcinomas
overexpress the erbB-3 gene, for example.

The correlation between aberrant expression and
transforming ability also extends to members of the Eph
subclass of receptor tyrosine kinases. For example,
carcinomas of the liver, lung, breast and colon show
elevated expression of Eph. Unlike many other tyrosine
kinases, this elevated expression can occur in the absence
of gene amplification or rearrangement. Such involvement
of Eph in carcinogenesis also has been shown by the
formation of foci of NIH 3T3 cells in soft agar and of
tumors in nude mice following overexpression of Eph.
Moreover, an antigen present on the surface of a pre-B cell
leukemia cell line also has been identified as a member of
the Eph subclass (Wicks et al., Proc. Natl. Acad. Sci. USA
89:1611-1615 (1992)). This leukemia-specific marker,
termed Hek, appears to be similar to the chicken Cek4 and
mouse Mek4 of the Eph subclass of receptor tyrosine kinases
(see Sajjadi et al., The New Biologist 3:769-778 (1991),
which is incorporated herein by reference). As with Eph,
Hek also was overexpressed in the absence of gene
amplification or rearrangements in, for example,
hemopoietic tumors and lymphoid tumor cell lines.

In addition to their roles in carcinogenesis, a
number of transmembrane tyrosine kinases have been reported
to play key roles during development. Examples include the
mouse c-kit proto-oncogene and the Drosophila genes
"sevenless" and "torso," which are involved in pattern
formation. Consistent with this developmental role, many
receptor tyrosine kinases other than those described above
also have been shown to be developmentally regulated and
predominantly expressed in embryonic tissues. Examples of
these other tyrosine kinases include Cek1, which belongs to
the FGF subclass, and the Cek4 and Cek5 tyrosine kinases
(Pasquale et al., Proc. Natl. Acad. Sci. USA 86:5449-5453
(1989); Sajjadi et al., supra, (1991); and Pasquale, E.B.,
Cell Reg. 2:523-534 (1991), all of which are incorporated
herein by reference).

Eph was the first member of the Eph subclass of
tyrosine kinases to be identified and characterized by
molecular cloning (Hirai et al., Science 238:1717-1720
(1987)). The name Eph is derived from the name of the cell
line from which the Eph cDNA was first isolated, the
erythropoietin-producing human hepatocellular carcinoma
cell line, ETL-1. The general structure of Eph is similar
to that of other receptor tyrosine kinases and consists of
an extracellular domain, a single membrane spanning region
and a conserved tyrosine kinase catalytic domain. However,
the structure of the extracellular domain of Eph, which
comprises an immunoglobulin (Ig) domain at the amino
terminus, followed by a cysteine-rich region and two
fibronectin type III repeats in close proximity to the
transmembrane domain, is completely distinct from that of
previously described receptor tyrosine kinases. The
juxtamembrane domain and carboxy-terminus regions of Eph
also are unrelated to the corresponding regions of other
tyrosine kinase receptors. Thus, the discovery of Eph
defined a new subclass of receptor-type tyrosine kinases.

In addition to the isolation and characterization
of Eph, other related tyrosine kinases now have been
identified. Cek4 and Cek5 were identified by screening a
chicken embryo cDNA expression library with
anti-phosphotyrosine antibodies (Sajjadi et al., supra, (1991)
and Pasquale, supra, (1991)). This method of
identification was successful because Cek4 and Cek5 are
expressed in embryonic tissues and have tyrosine kinase
activity even when expressed as partial fragments in
bacteria. Other Eph-related kinases that have been
identified include Hek (Wicks et al., supra, (1992)), Sek
(Gilardi-Hebenstreit et al., Oncogene 7:2499-2506 (1992)),
Eck (Lindberg and Hunter, Mol. Cell. Biol. 10:6316-6324
(1990)), Elk (Lhotak et al., Mol. Cell. Biol. 11:2496-2502
(1991)) and Eek (Chan and Watt, Oncogene 6:1057-1061
(1991)). These tyrosine kinases were cloned using a
variety of methods.

The number of existing Eph-related kinases is not
known and cannot be predicted. However, the Eph subclass
already represents the largest known subclass of receptor
tyrosine kinases, comprising at least 10 distinct members.
The kinases belonging to the Eph subclass are so classified
because each includes features such as the amino terminal
Ig domain, the cysteine-rich stretch and two fibronectin
type III repeats in the extracellular domain, which are
conserved within the Eph subclass. However, despite these
common structural features, the overall amino acid
sequences outside the catalytic domain are quite different,
indicating that different members of the Eph subclass
interact with distinct ligands and substrates and, thus,
exert distinct functions. This notion is supported by the
differential distribution of different Eph-related kinases
in adult tissues.

There is no indication whether other Eph-related
kinases exist and, if so, what their relationship is to the
known Eph-related kinases. Nevertheless, despite
similarities among the Eph-related receptor tyrosine
kinases, each is different and, as such, functions in
related but distinct cellular processes. For example,
many members of the Eph subclass are expressed in the
nervous system during development and thus are likely to be
involved in nerve regeneration processes. The aberrant
expression or uncontrolled regulation of any one of these
receptor tyrosine kinases can result in different
malignancies and pathological disorders. Therefore, the
identification and characterization of novel transmembrane
tyrosine kinases should provide important insights into the
mechanisms underlying oncogenesis and cellular growth
control pathways.

There thus exists a need to identify additional
receptor tyrosine kinases and to manipulate them in order
to diagnose pathological conditions and control cellular
processes. The present invention satisfies this need and
provides related advantages as well.

SUMMARY 

The invention is directed to substantially
purified Eph-related protein tyrosine kinases, or
functional fragments thereof, having about 23 to 66 percent
amino acid sequence identity in their carboxyl terminal
variable region compared to the other known members of the
Eph subclass of tyrosine kinases. Nucleic acids encoding
such Eph-related protein tyrosine kinases, vectors and host
cells also are provided. The invention also is directed to
a method of diagnosing cancer. The method includes
removing a tissue or cell sample from a subject suspected
of having cancer and determining the level of Eph-related
protein tyrosine kinase in the sample, wherein a change in
the level or activity of an Eph-related protein tyrosine
kinase compared to a normal sample indicates the presence
of a cancer or correlates with a specific prognosis.

BRIEF DESCRIPTION OF THE DRAWINGS

Figure 1 shows a comparison of the amino acid
sequences from members of the Eph family. Dots replace
residues in Cek4 (SEQ ID NO: 16), Cek6 (SEQ ID NO: 2), Cek7
(SEQ ID NO: 4), Cek8 (SEQ ID NO: 6), Cek9 (SEQ ID NO: 8),
Cek10 (SEQ ID NO: 10), Eck and Eph that are identical to
the corresponding residue in Cek5 (SEQ ID NO: 18). Dashes
represent gaps introduced in the sequences to aid in the
alignment. The insertion sequence of Cek5 also is
presented (Cek5+; SEQ ID NO: 12) and the insertion sequences
of Cek7+ (SEQ ID NO: 20) and Cek10+ (SEQ ID NO: 14) are in
parentheses. The conserved cysteines are indicated by the
symbol and the kinase domain is delimited by arrows.
Open circles indicate the hydrophobic and aromatic residues
that are conserved in the first fibronectin type III repeat
and asterisks indicate the conserved residues of the second
fibronectin type III repeat. The filled circle indicates
the site of putative tyrosine autophosphorylation in the
catalytic domain. The putative signal peptide sequences
and transmembrane domains are underlined. Amino acids are
numbered at the left of the sequences. The symbol +
indicates the location of the extracellular domain amino
acid insertion RICTPDVSGTVGSRPAADH (SEQ ID NO: 23),
corresponding to Cek6 amino acids 426-444. Alignments were
made by eye in the regions corresponding to Cek5 residues
1-615 and using the program DFALIGN (Feng and Doolittle, J.
Mol. Evol. 25:351-360 (1987), which is incorporated herein
by reference) in the regions corresponding to Cek5 residues
616-995.

Figure 2 shows an RNA blot analysis of Cek mRNAs.
Polyadenylated chicken RNA from 10-day embryonic and adult
tissues was hybridized with Cek-specific cDNA probes and
with a chicken β-actin probe. Hybridization conditions
were as described in Example I. The positions of RNA
molecular weight standards (in kilobases, kb) are indicated
on the right. β-actin transcripts are present in the ~2.0
kb size range.

Figure 3 shows an RNA blot analysis of Cek5 mRNAs.
Polyadenylated RNA from body tissues (lanes 1 and 2) and
brain (lanes 3 and 4) of 10-day chicken embryos was
hybridized with a Cek5-specific cDNA (lanes 1 and 3). The
same blots were then stripped and rehybridized with a 48 bp
oligonucleotide antisense probe corresponding to the
juxtamembrane insertion sequence of Cek5 (lanes 2 and 4).
Hybridization conditions were as described in Example I.
The positions of RNA molecular weight standards (in kb) are
indicated on the right.

Figure 4 shows immunoblotting with antibodies to
different Eph-related kinases. Fractions from 10-day
embryonic brain containing either membrane-associated
proteins (M) or soluble proteins (S) were probed with
anti-Cek4 (4), anti-Cek8 (8) or anti-Cek9 (9) antibodies.
Equal amounts of protein were loaded in all the lanes. IP,
immunoprecipitates from 11-day embryonic retina with
anti-Cek8 antibodies (8) or with normal rabbit IgGs (Ig). The
immunoprecipitates were then probed with anti-Cek8
antibodies.

Figures 5.A. to 5.D. show the expression and
tyrosine phosphorylation of Cek8 and Cek5 in transformed
cell lines. Cell lysates were prepared from the rat
central nervous system (CNS) tumor-derived cell lines B23,
B28, B35, B49 and B50, the mouse embryonic carcinoma cell
line, P19, and the human keratinocyte cell line HaCaT (Ha).
Panels A and B show immunoprecipitates with anti-Cek8
antibodies. Panels C and D show immunoprecipitates with
anti-Cek5 antibodies. The immunoprecipitation was followed
by in vitro kinase reaction in the samples shown in panel
C. The immunoblot in panel A was probed with anti-Cek8
antibodies. The immunoblots in panels B, C and D were
probed with anti-phosphotyrosine antibodies.

Figures 6.A. to 6.F. demonstrate that Cek8
phosphorylation on tyrosine is increased in transformed
cells and correlates with increased in vitro catalytic
activity. Lysates from LMH cells and extracts of 10-day
embryonic liver and adult liver were immunoprecipitated
with anti-Cek8 antibodies, probed with anti-phosphotyrosine
antibodies (panel A), then reprobed with anti-Cek8
antibodies (panel B). Lysates from normal chicken embryo
fibroblasts and Rous sarcoma virus transformed chicken
embryo fibroblasts were immunoprecipitated with anti-Cek8
antibodies, probed with anti-phosphotyrosine antibodies
(panel C) and reprobed with anti-Cek8 antibodies (panel D).

Panels E and F show immunoblots of
immunoprecipitated Cek8 (lane 1) or β-galactosidase-Cek4
fusion protein substrate (lanes 2-5). The fusion protein
was phosphorylated for 1 min at 37 °C by Cek8 (lane 2), 1
min at 37 °C by tyrosine phosphorylated Cek8 (lane 3), 1
min at 0 °C by Cek8 (lane 4), 1 min at 0 °C by tyrosine
phosphorylated Cek8 (lane 5). Immunoblots were probed with
anti-phosphotyrosine antibodies (panel E) and reprobed with
anti-β-galactosidase antibodies (panel F).

DETAILED DESCRIPTION OF THE INVENTION

The invention relates to the identification and
characterization of seven novel members of the Eph subclass
of membrane-spanning tyrosine kinases. The identification
of these members doubles the number of kinases within this
subclass, bringing the total to at least ten different
Eph-related kinases. These Eph-related kinases therefore
comprise the largest known subclass of integral membrane
tyrosine kinases. The large number of different
Eph-related kinases indicates that these receptors regulate a
number of distinct cellular processes during development as
well as in the adult organism. Therefore, identification
of novel proteins within this subclass and isolation of
their encoding nucleic acids allows the control of
different cellular processes through the production of
specific agonists and antagonists and through genetic
therapy.

In one embodiment, seven novel kinases of the Eph
subclass of receptor protein tyrosine kinases have been
identified. The cDNAs encoding these Eph-related kinases
were identified by hybridization at differential
stringencies to identify distinct, but related receptor
tyrosine kinases. All of the kinases exhibit gross
structural features of known receptor tyrosine kinases in
that they contain an extracellular ligand binding domain,
a transmembrane domain and a cytoplasmic catalytic domain.
These novel kinases are related to the Eph subclass of
receptor tyrosine kinases and are designated Cek6 through
Cek10+ (SEQ ID NOS: 1 to 14, and 19 to 22). The overall
sequence identity between these Eph-related kinases varies
significantly, with each of the novel Eph-related receptors
being identified by its carboxyl terminal variable region.

In another embodiment, the novel Eph-related
kinases exhibit distinct tissue distribution patterns and
developmental expression. Six of the kinases can be found
to be expressed in both the embryonic brain and body
tissues. The seventh Eph-related kinase, Cek5+, is
expressed only in the embryonic brain. Indicative of their
roles in cellular processes, such as embryonic signal
transduction pathways, these Eph-related kinases display
distinct patterns of expression in adult tissues, including
the neuronal specific expression of Cek5+. These distinct
patterns can be used to diagnose aberrations in normal
cellular processes, such as those leading to uncontrolled
malignant cell growth. For example, as described below,
Cek8 activity is increased in various tumor cells as
compared to normal cells. In addition to diagnosing such
aberrations, it is also possible to treat defects caused by
the unregulated expression of Eph-related kinases through
the use of gene therapy. Reagents affecting the expression
or activity of Eph-related kinases can also be useful for
inducing nerve regeneration following injury.

As used herein, the term "Eph-related protein
tyrosine kinase" or "Eph-related kinase" refers to a
receptor tyrosine kinase having an extracellular ligand
binding domain, a transmembrane domain and a cytoplasmic
catalytic domain, and belonging to the Eph subclass of
receptor tyrosine kinases. Eph-related kinases include,
for example, the receptor tyrosine kinases Cek6 (SEQ ID NO:
2), Cek7 (SEQ ID NO: 4), Cek7+ (SEQ ID NO: 20), Cek7' (SEQ
ID NO: 22), Cek8 (SEQ ID NO: 6), Cek9 (SEQ ID NO: 8), Cek10
(SEQ ID NO: 10), Cek5+ (SEQ ID NO: 12) and Cek10+ (SEQ ID
NO: 14). Such kinases exhibit an overall amino acid
sequence identity to Eph of greater than about 40 percent.
The extreme carboxyl terminal cytoplasmic regions of the
kinases are not well conserved and can be used to
differentiate among them. This extreme carboxyl terminal
cytoplasmic region begins just after the catalytic domain
at about residue number 900 and extends to the C-terminal
most residue. Therefore, the term "carboxyl terminal
variable region," as used herein, refers to this extreme
C-terminal region of the sequence which is divergent between
the different members of the Eph subclass of tyrosine
kinases. The actual sequence identities between different
kinases within the Eph subclass are as follows:
Cek5-Cek10: 66%; Cek5-Cek6: 54%; Cek5-Cek9: 50%; Cek5-Cek8: 38%;
Cek5-Cek7: 34%; Cek5-Cek4: 24%; Cek5-Eek: 39%; Cek5-Eck:
36%; Cek5-Eph: 33%; Cek10-Cek6: 64%; Cek10-Cek9: 56%;
Cek10-Cek8: 47%; Cek10-Cek7: 45%; Cek10-Cek4: 32%;
Cek10-Eek: 41%; Cek10-Eck: 39%; Cek10-Eph: 37%; Cek6-Cek9: 46%;
Cek6-Cek8: 50%; Cek6-Cek7: 40%; Cek6-Cek4: 31%; Cek6-Eek:
39%; Cek6-Eck: 36%; Cek6-Eph: 32%; Cek9-Cek8: 46%;
Cek9-Cek7: 47%; Cek9-Cek4: 29%; Cek9-Eek: 36%; Cek9-Eck: 33%;
Cek9-Eph: 35%; Cek8-Cek7: 37%; Cek8-Cek4: 26%; Cek8-Eek:
39%; Cek8-Eck: 36%; Cek8-Eph: 30%; Cek7-Cek4: 36%;
Cek7-Eek: 35%; Cek7-Eck: 43%; Cek7-Eph: 37%; Cek4-Eek: 29%;
Cek4-Eck: 27%; Cek4-Eph: 23%; Eek-Eck: 26%; Eek-Eph: 32%;
Eck-Eph: 52%. Therefore, the carboxyl terminal variable
region exhibits an amino acid sequence identity of about 23
to 66 percent between the different Eph-related kinases.
The novel Eph-related kinases described herein fall within
this level of sequence divergence and can therefore be
distinguished by comparison to the known members of the Eph
subclass. Known members of this subclass include, for
example, Eph, Cek4, Cek5, Mek4, Hek, Sek (or mouse Cek8),
Eck, Elk (or rat Cek6) and Eek.
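
The pairwise values listed above can be tabulated to confirm the stated range of about 23 to 66 percent. The following Python sketch is illustrative only. The identity figures are transcribed from the preceding paragraph; the function names and the convention used in `percent_identity` (identical residues divided by the aligned columns in which neither sequence has a gap) are assumptions, since the text does not state how the percentages were computed.

```python
# Illustrative sketch; names and the identity convention are assumptions.

# Kinases in the order used by the pairwise list in the text.
ORDER = ["Cek5", "Cek10", "Cek6", "Cek9", "Cek8",
         "Cek7", "Cek4", "Eek", "Eck", "Eph"]

# Percent identity of each kinase to every kinase that follows it in ORDER.
ROWS = {
    "Cek5":  [66, 54, 50, 38, 34, 24, 39, 36, 33],
    "Cek10": [64, 56, 47, 45, 32, 41, 39, 37],
    "Cek6":  [46, 50, 40, 31, 39, 36, 32],
    "Cek9":  [46, 47, 29, 36, 33, 35],
    "Cek8":  [37, 26, 39, 36, 30],
    "Cek7":  [36, 35, 43, 37],
    "Cek4":  [29, 27, 23],
    "Eek":   [26, 32],
    "Eck":   [52],
}


def pairwise_identities():
    """Expand ROWS into a {(kinase_a, kinase_b): percent} mapping."""
    pairs = {}
    for i, first in enumerate(ORDER[:-1]):
        partners = ORDER[i + 1:]
        values = ROWS[first]
        if len(values) != len(partners):
            raise ValueError(f"row for {first} has the wrong length")
        for second, value in zip(partners, values):
            pairs[(first, second)] = value
    return pairs


def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over aligned columns where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [(x, y) for x, y in zip(aligned_a, aligned_b)
               if x != gap and y != gap]
    if not columns:
        return 0.0
    matches = sum(1 for x, y in columns if x == y)
    return 100.0 * matches / len(columns)


if __name__ == "__main__":
    pairs = pairwise_identities()
    print(len(pairs), min(pairs.values()), max(pairs.values()))
```

With ten kinases there are 45 pairs; the lowest value in the list is Cek4-Eph and the highest is Cek5-Cek10.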

It is understood that limited modifications may
be made without destroying biological functions of
Eph-related kinases and that only a portion of the entire
primary structure may be required in order to effect a
particular activity. Such biological functions and
activities can include, for example, signal transduction,
ligand binding and/or tyrosine kinase activity. For
example, the Eph-related kinases of the invention have
amino acid sequences substantially similar to those shown
for Cek7, Cek7+, Cek7', Cek9, Cek10, Cek5+, Cek10+ and
chicken Cek6 and Cek8 in Figure 1 (hereinafter referred to
as Cek6 through Cek10+), but minor modifications of these
sequences which do not destroy their activity also fall
within the definition of Eph-related kinases and within the
definition of the protein claimed as such. Moreover,
fragments of the sequences of Cek6 through Cek10+ in Figure
1 (SEQ ID NOS: 2, 4, 6, 8, 10, 12, 14, 20 and 22), which
retain the function of the entire protein as well as
functional domains that contain at least one function of
the intact protein are included within the definition.
Functional domains can include, for example, active ligand
binding and catalytic domains. The boundaries of such
domains are not important so long as activity is
maintained. It is also understood that minor modifications
of the primary amino acid sequence can result in proteins
which have substantially equivalent or enhanced function as
compared to the sequences set forth in Figure 1 (SEQ ID
NOS: 2, 4, 6, 8, 10, 12, 14, 20 and 22). These
modifications may be deliberate, as through site-directed
mutagenesis, or may be accidental such as through mutation
in hosts which produce Eph-related kinases. All of these
modifications are included as long as biological function
is retained. Further, various molecules can be attached to
Eph-related kinases, for example, other proteins,
carbohydrates, or lipids. Cek8 (SEQ ID NO: 6), for
example, can contain complex N-linked oligosaccharides
(see below). Such modifications are included within the
definition of Eph-related tyrosine kinase.

The term "substantially purified," when used to
describe the state of Eph-related tyrosine kinases denotes
the protein free of a portion of the other proteins and
molecules normally associated with or occurring with
Eph-related kinases in their native environment. Such
substantially purified Eph-related kinases can be derived
from natural sources, recombinantly expressed or
synthesized by in vitro methods so long as some portion of
normally associated molecules is absent.

"Isolated" when used to describe the state of the 
nucleic acids encoding Eph-related tyrosine kinases denotes
the nucleic acids free of at least a portion of the
molecules associated with or occurring with Eph-related
nucleic acids in the native environment.

As used herein, the term "vector" includes
nucleic acids that are capable of harboring a natural or
recombinant DNA sequence of interest. Vectors are usually
derived from, or contain some sequences from, a natural
source. For example, bacteriophage vectors contain
specially engineered features, are largely derived from
the phage's genome and are capable of carrying out some
part of its infectious cycle. On the other hand, the
sequences contained within plasmids are usually derived
from different sources and compiled into a single molecule
to carry out specific tasks. Thus, there are many different
types of vectors and each is used according to the need to
perform a desired function. Functions can include, for
example, propagation in a desired host, cloning recombinant
or natural fragments of DNA, mutagenesis, expression and
the like. In sum, "vector" is given an operative
definition, and any DNA sequence which is capable of
effecting a function of a specified DNA sequence disposed
therein is included in this term as it is applied to the
specified sequence.

The invention provides a substantially purified
Eph-related protein tyrosine kinase, or functional fragment
thereof. Also provided is a substantially purified chicken
Eph-related protein tyrosine kinase. The substantially
purified Eph-related protein tyrosine kinase exhibits about
23 to 66 percent amino acid sequence identity in its
carboxyl terminal variable region compared to known members
of the Eph subclass of tyrosine kinases. The amino acid
sequences are substantially the same as that shown for Cek6
through Cek10+ in Figure 1 (SEQ ID NOS: 2, 4, 6, 8, 10, 12,
14, 20 and 22).

The invention also provides an isolated nucleic
acid encoding an Eph-related protein tyrosine kinase, or
functional fragment thereof. The isolated nucleic acid
encoding an Eph-related protein tyrosine kinase exhibits
about 23 to 66 percent amino acid sequence identity in its
carboxyl terminal variable region compared to known members
of the Eph subclass of tyrosine kinases. The encoding
nucleotide sequences are substantially the same as that
shown for Cek6 (SEQ ID NO: 1), Cek7 (SEQ ID NO: 3), Cek8
(SEQ ID NO: 5), Cek9 (SEQ ID NO: 7), Cek10 (SEQ ID NO: 9),
Cek5+ (SEQ ID NO: 11), Cek10+ (SEQ ID NO: 13), Cek7+ (SEQ ID
NO: 19) and Cek7' (SEQ ID NO: 21) (hereinafter Cek6 through
Cek10+).

The isolation of seven cDNAs that encode novel
Eph-related receptor tyrosine kinases is disclosed herein.
The predicted amino acid sequences of these Eph-related
kinases are shown in Figure 1 along with other known Cek
kinase sequences and those of Eph and Eck. A number of
conserved features serve to define the newly discovered
kinases as members of the Eph subclass. These include an
amino terminal immunoglobulin domain followed by a
cysteine-rich stretch in the extracellular domain, with the
position of most cysteines conserved, and sequences
corresponding to two fibronectin type III repeats in close
proximity to the transmembrane domain (O'Bryan et al., Mol.
Cell. Biol. 11:5016-5031 (1991) and Pasquale, supra,
(1991), the former of which is incorporated herein by
reference). Potential sites of N-glycosylation are
primarily localized in the C-terminal half of the
extracellular regions. The homologies in the extracellular
domains indicate that the different members of the Eph
family can bind a similar class of ligands. Figure 1 also
shows that the Eph family, with the inclusion of the new
members that have been identified, can now be considered
the largest known family of membrane-spanning tyrosine
kinases. Such a large number of tyrosine kinases in this
one class is surprising in view of the fact that the other
families of receptor tyrosine kinases have fewer members.

The catalytic domains of the Eph-related kinases
are highly conserved and exhibit amino acid identities
ranging between 61% and 90%. The C-terminal tails are less
conserved (Figure 1) and therefore constitute a variable
region which can be used to specify the distinct
Eph-related kinases. Only one of the tyrosines in the
C-terminal variable region, corresponding to tyrosine 939 of
Cek5, is conserved in all the members of the Eph family,
with the exception of Cek4. This conserved tyrosine
residue represents a likely site of autophosphorylation and
regulation (Ullrich and Schlessinger, Cell 61:203-212
(1990)). The large size of the Eph subclass of receptor
tyrosine kinases, the variability within their sequences
and their different tissue distributions indicate that each
receptor can, for example, serve distinct functions during
cellular processes.

The variability in both the lengths and sequences
of the juxtamembrane domains observed in the Eph-related
kinases is unusual among tyrosine kinases belonging to the
same subclass (Ullrich et al., supra, (1990)). Because clones
encoding variants with amino acid insertions in the
juxtamembrane domain were isolated for Cek5, Cek7 and
Cek10, the variability in the lengths of the juxtamembrane
domains is likely to originate by alternative splicing
(Figure 1). Juxtamembrane domains are important for the
modulation of receptor functions by heterologous stimuli,
for example, through phosphorylation by other kinases. The
juxtamembrane domains of the members of the Eph family
contain numerous serines, threonines and tyrosines that can
serve as sites of regulation by phosphorylation (Kemp et
al., Trends Biol. Sci. 15:342-346 (1990), which is
incorporated herein by reference). For example, Cek9 and
Cek10, as well as Cek5, Cek6, and Eck contain the consensus
sequence (S/T)P, which is recognized by proline-dependent
protein kinases such as cdc2 (Kemp et al., supra, (1990)).
Juxtamembrane domains have also been indicated to be
important in the regulation of the subcellular distribution
of the kinase and in the binding of some substrates
(Ullrich et al., supra, (1990)).

The mRNA corresponding to Cek5+ (SEQ ID NO: 11),
the variant form of Cek5, was shown to be specifically
expressed in the CNS, indicating that Cek5+ functions
primarily in neuronal cellular functions. Indicative of
this is another tyrosine kinase, src, which has been shown
to encode neuronal specific variants containing 6 to 17
amino acid insertions in the regulatory (non-catalytic)
region (Brugge et al., Nature 316:554-557 (1985); Martinez
et al., Science 237:411-415 (1987); Pyper et al., Mol.
Cell. Biol. 10:2035-2040 (1990), all of which are
incorporated herein by reference). These neuronal forms of
c-src have higher specific catalytic activity than
non-neuronal c-src.
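
The (S/T)P consensus sequence noted above for the juxtamembrane domains, a serine or threonine immediately followed by proline, can be located in a protein sequence with a simple pattern search. The Python sketch below is illustrative only; the function name and the toy sequence are hypothetical and are not taken from any of the disclosed SEQ ID NOS.

```python
import re

# (S/T)P consensus: serine or threonine immediately followed by proline.
ST_P_MOTIF = re.compile(r"[ST]P")


def find_st_p_sites(sequence):
    """Return (1-based position, motif) for each (S/T)P match in a sequence."""
    return [(match.start() + 1, match.group())
            for match in ST_P_MOTIF.finditer(sequence.upper())]


if __name__ == "__main__":
    # Hypothetical toy peptide used only to exercise the function.
    toy_sequence = "MSPKTPAAY"
    for position, motif in find_st_p_sites(toy_sequence):
        print(position, motif)
```

A match identifies only a candidate site; whether a given residue is actually phosphorylated would have to be determined experimentally.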

Although the predicted molecular masses of the
different members of the Eph family are similar, the sizes
of their transcripts appear quite varied (4 to 10 kb). In
addition, several mRNA species for each of the Eph-related
kinases, particularly in the CNS, were detected using a
panel of probes. As described below, the patterns of
expression of these novel Eph-related kinases are also
distinct.

DNA sequences encoding the polypeptides of
Eph-related kinases can be obtained by methods known to one
skilled in the art. The sequences described herein are
sufficient for one skilled in the art to practice the
invention. Such methods include, for example, cDNA
synthesis and polymerase chain reaction (PCR). The need
will determine which method or combination of methods is to
be used to obtain the desired sequence. Expression can be
performed in any compatible vector/host system. Such
systems include, for example, plasmids or phagemids in
procaryotes such as E. coli, yeast systems and other
eucaryotic systems such as mammalian cells. Additionally,
the Eph-related kinases can also be expressed in soluble or
secreted form depending on the need and the vector/host
system employed.

Such vectors and vector/host systems are known,
or can be constructed by those skilled in the art and
should contain all expression elements necessary for the
transcription, translation, regulation, and sorting of the
polypeptide which makes up the Eph-related kinase. Other
beneficial characteristics may also be contained within the
vectors such as mechanisms for recovery of the nucleic
acids in a different form. Phagemids are a specific
example of this because they can be used either as plasmids
or as bacteriophage vectors. The vectors can also be for
use in either procaryotic or eucaryotic host systems so
long as the expression elements are of a compatible origin.
One of ordinary skill in the art will know which host
systems are compatible with a particular vector. Thus, the
invention provides vectors, host cells transformed with the
vectors and Eph-related kinases produced from the host
cells containing a nucleic acid encoding an Eph-related
kinase.

The invention also provides methods of diagnosing
cancer and determining cancer prognosis. The method
includes removing a tissue or cell sample from a subject
suspected of having cancer and determining the level of
Eph-related protein tyrosine kinase in said sample, wherein
a change in the level or activity of an Eph-related protein
tyrosine kinase compared to a normal sample indicates the
presence of a cancer or indicates the level of malignancy
of a cancer and, therefore, the most appropriate course of
treatment.

As stated previously, receptor tyrosine kinases
are involved in many signal transduction events that
regulate important cellular processes. Such processes
include, for example, cellular differentiation and
proliferation. Abnormal regulation or expression of the
signal transduction machinery can lead to aberrant and
malignant growth of the abnormally regulated cells.
Abnormal expression of Eph is known to be associated with
carcinomas of the liver, lung, breast and colon, for
example. Likewise, since some Eph-related tyrosine kinases
are, at least, found within the same tissues as Eph, their
abnormal expression may also lead to the development of the
carcinomas described above as well as other types of
cancers. For example, increased Cek8 activity was found in
embryonal carcinoma cells and a keratinocyte tumor cell
line (see Example II). Additionally, cancers of the
neuronal lineage are likely to be caused by the abnormal
expression or regulation of an Eph-related kinase such as
Cek8 (see Example II) or Cek5+ since this Eph-related kinase
is found exclusively in neuronal tissues. Cek5+, Cek5 and
the other Eph-related kinases expressed in the nervous
system also are likely to be involved in nerve
regeneration.

The important role that these receptor tyrosine
kinases play in cellular processes can be advantageously
used to diagnose early stages of cancer within a cell
sample or tissue. A change in the amount or activity of an
Eph-related kinase in a suspected sample, compared to a
normal sample, will be indicative of cancerous stages and
of their level of malignancy. Depending on whether the
normal state is caused by the presence or absence of an
Eph-related kinase, the change can involve either an
increase or decrease in the amount or activity of the
Eph-related kinase. For example, Cek8 activity is increased in
various tumor cells (see Example II). Thus, increased
activity of an Eph-related kinase of the invention such as
Cek8 (SEQ ID NO: 6) can be useful for identifying the
presence of transformed cells such as occur in a cancer.

One skilled in the art can measure the level or
activity of an Eph-related kinase, for example, in a tissue
sample obtained from a subject suspected of having a cancer
or a developmental abnormality and the level or activity of
the Eph-related kinase can be compared to the level or
activity known to be present in a normal sample. Such a
known level of activity can be determined by obtaining a
significant number of tissue samples from subjects that do
not have a cancer or a developmental abnormality and
measuring the levels or activities of an Eph-related kinase
in the population of samples. Methods for determining the
level or activity of Eph-related kinases are known to the
skilled artisan and include, for example, RNA and protein
blot analysis, ELISA using specific antibodies to each of
the Eph-related kinases and direct measurement of catalytic
activity such as tyrosine kinase activity. Such methods
are described in detail in Example II or are otherwise
known in the art (see, for example, Harlow et al.,
Antibodies: A Laboratory Manual, Cold Spring Harbor
Laboratory (1988), which is incorporated herein by
reference).
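
The comparison of a suspect sample with a population of normal
samples can be sketched as follows. This is only an illustration:
the function name, the z-score criterion and the cutoff of 2.0 are
assumptions introduced here and are not taken from the
specification, which does not prescribe a statistical test.

```python
from statistics import mean, stdev


def classify_level(sample_value, normal_values, z_cutoff=2.0):
    """Compare one measurement with measurements from normal samples.

    sample_value  -- kinase level or activity in the suspect sample
    normal_values -- the same quantity measured in normal samples
    z_cutoff      -- hypothetical threshold (an assumption, not from the text)

    Returns a (label, z_score) tuple.
    """
    if len(normal_values) < 2:
        raise ValueError("need at least two normal samples")
    mu = mean(normal_values)
    sigma = stdev(normal_values)  # sample standard deviation
    if sigma == 0:
        raise ValueError("normal samples show no variation")
    z = (sample_value - mu) / sigma
    if z > z_cutoff:
        return "increased", z
    if z < -z_cutoff:
        return "decreased", z
    return "within normal range", z
```

Both an increase and a decrease are reported, since either
direction of change may be informative depending on the kinase
and tissue. All values must be expressed in the same arbitrary
units (for example, normalized band intensity).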

The following examples are intended to
illustrate, but not limit, the invention.

EXAMPLE I 

Isolation and Characterization of 
Eph-Related Tyrosine Kinases 

This example shows the cloning and sequencing of
the Eph-related kinases Cek6 through Cek10+. Structural
characteristics and patterns of expression are also
described.

To find novel members of the Eph family, various
cDNA probes were used at different stringencies to screen
a 10-day embryonic library as well as a 13-day embryonic
brain cDNA library. The probes were derived from Cek4 (SEQ
ID NO: 15) or Cek5 (SEQ ID NO: 17), which had been
previously isolated based on phosphotyrosine content.
Following subcloning and sequence analysis, it was found
that the newly isolated cDNA clones encoded seven different
Eph-related tyrosine kinases. Their isolation and
structure are described below.

Briefly, a 10-day chicken embryo λgt11 cDNA
library (Clontech) and a 13-day embryonic brain λgt11 cDNA
library were used to isolate the cDNA clones. Screening
was performed at different stringencies using the following
procedure. Plaques were transferred to nylon membranes
(Micron Separations Inc.) on duplicate filters and
hybridized to the appropriate probes at one of two
stringencies (50% formamide, 42°C; or 50% formamide, 37°C).
Conditions used were those recommended by the manufacturer
and probes were detected using a nonradioactive DNA
labeling and detection method (Boehringer Mannheim).
Plaques identified as positive were subjected to three
rounds of purification prior to DNA extraction using
Lambda-TRAP (Clontech). Inserts from recombinant lambda
DNA were subcloned in pBluescript vectors (Stratagene, San
Diego, CA) using standard procedures and the sequences were
analyzed on both strands, using the dideoxynucleotide
chain-termination technique with Sequenase (United States
Biochemical, Cleveland, OH).

Several clones distinguishable over known Eph
tyrosine kinases were isolated using the Cek5 probe, which
corresponded to nucleotides 495-3223 (Pasquale, supra,
(1991)). The clones include: one Cek5+ cDNA clone (from
the chick embryo library); three Cek6 clones (two from the
embryonic brain and one from the chick embryo library); one
Cek7 clone (from the chick embryo library); one Cek7+ clone
(from the chick embryo library); one Cek7' clone (from the
embryonic brain library); one Cek9 clone (from the chick
embryo library); one Cek10+ clone (from the chick embryo
library) and two Cek10 or Cek10+ clones, which are
indistinguishable because they do not encode the
juxtamembrane domain (one from the chick embryo and one
from the embryonic brain library).

A Cek4 probe (corresponding to nucleotides 748-1756;
see Sajjadi et al., supra, 1991), on the other hand,
was used to isolate one Cek8 clone (from the chick embryo
library). Also, following its initial isolation, a Cek10
probe, corresponding to residues 400-596 in Figure 2, was
used to isolate clones extending further into the 5' end
from the chick embryo library. Of the two clones isolated,
one represented Cek10 and one Cek10+.



The above-identified Eph-related kinases were
characterized in terms of tissue distribution and
expression by RNA blot analysis. Poly-A+ RNA was prepared
from chicken tissues using the procedure of Badley et al.,
Biotechniques 6:114-116 (1988), which is incorporated
herein by reference. Poly-A+ RNA (4-5 µg) was
size-fractionated alongside RNA molecular weight markers on 0.9%
agarose gels containing formaldehyde (Sambrook et al.,
Molecular Cloning: A Laboratory Manual (Cold Spring Harbor
Laboratory Press, Cold Spring Harbor, NY, 1989), which is
incorporated herein by reference) and transferred to
nitrocellulose filters (Schleicher & Schuell) according to
methods known to one skilled in the art. The membranes
were prehybridized for 2 hours and then hybridized under
stringent conditions (50% formamide, 5x SSPE, 5x Denhardt's
reagent, 0.5% SDS, 100 µg/ml salmon testes DNA, 42°C).
Probes were labeled with 32P-dATP by the random-primed
method of Feinberg and Vogelstein, Anal. Biochem. 132:6-13
(1983), which is incorporated herein by reference. T4
polynucleotide kinase was used to label the 5' end of the
Cek5+ specific oligonucleotide (Sambrook et al., supra,
1989). Filters were washed to a final stringency of 0.1x
SSPE, 0.1% SDS at 58°C prior to exposure to Kodak XAR-5
X-ray film. For autoradiography of β-actin controls,
intensifying screens were typically omitted and exposure
time was reduced to 2 hours.
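
Because the RNA is run alongside molecular weight markers,
transcript sizes such as those reported below can be estimated by
interpolation. The following sketch assumes the common
approximation that log10(size) varies linearly with migration
distance; that model, the function names and any marker values
supplied to it are illustrative assumptions, not details taken
from the specification.

```python
import math


def fit_log_linear(marker_sizes_kb, marker_distances_mm):
    """Least-squares fit of log10(size_kb) = a + b * distance_mm.

    Returns the intercept a and slope b of the calibration line.
    """
    n = len(marker_sizes_kb)
    if n < 2 or n != len(marker_distances_mm):
        raise ValueError("need matching lists with at least two markers")
    xs = list(marker_distances_mm)
    ys = [math.log10(size) for size in marker_sizes_kb]
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxx = sum((x - x_mean) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("marker distances must not all be equal")
    b = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys)) / sxx
    a = y_mean - b * x_mean
    return a, b


def estimate_size_kb(distance_mm, a, b):
    """Transcript size (kb) predicted by the calibration line."""
    return 10 ** (a + b * distance_mm)
```

The calibration is only as good as the marker range: bands that
migrate outside the span of the markers are extrapolated and
should be read as approximate, in keeping with the "about"
qualifiers used for the sizes in this example.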

The following cDNA probes were used for RNA blot
analysis: Cek4, 1.2 kb, same probe used for the library
screening described previously, hybridizes to the region
encoding amino acid residues 240-575; Cek5 probe, 1.2 kb,
hybridizes to the 3' untranslated region; Cek6 5' probe,
1.3 kb, hybridizes to amino acid residues 1-438; Cek6 3'
probe, 0.6 kb, hybridizes to the region following amino
acid 844; Cek7 5' probe, 0.4 kb, hybridizes to amino acid
residues 1-136; Cek7 3' probe, 2.0 kb, hybridizes to the
region following amino acid 137, including the 3'
untranslated region; Cek8 probe, 1.2 kb, hybridizes to the
region encoding amino acid residues 1-406; Cek9 probe, 0.6
kb, hybridizes to the region encoding amino acid residues
1-208; Cek10 probe, 0.6 kb, hybridizes to the region
encoding the 10 C-terminal amino acids and to about 600
nucleotides of 3' untranslated region. For Cek6 and Cek7,
the 3' Cek6 probe and the 5' Cek7 probe were used for the
embryonic tissue mRNAs and a mixture of 5' and 3' probes
for the adult tissue mRNAs.

Polyadenylated RNA was isolated from a number of
adult chick tissues, as well as from brain and body tissues
of 10-day embryos. These RNAs were then used for RNA blot
analysis using the above specific probes. Probes were
designed to minimize the possibility of cross-hybridization
among the related kinases. Chicken β-actin DNA was used as
a control probe (Cleveland et al., Cell 20:95-105 (1980),
which is incorporated herein by reference).

The amino acid sequence of Cek4 (SEQ ID NO: 16)
is 67% identical to that of Cek5 (SEQ ID NO: 18) in the
catalytic and C-terminal regions and is most closely
related to that of Cek7 (SEQ ID NO: 4) (75% amino acid
identity in the same regions) (Figure 1). Preliminary data
had indicated that Cek4 was highly expressed in the chicken
developing brain and embryonic tissues, but no information
was obtained on the adult pattern of expression in the
chick. These data were therefore included in Figure 2.
The 7.5 kb Cek4 transcript previously described was
confirmed to be abundant in 10-day embryonic tissues.
Expression was pronounced in the adult brain and retina,
and lower but detectable in all other adult tissues
examined, except the liver. In addition to the major 7.5
kb transcript, a smaller Cek4 transcript (of about 5 kb)
was found to be expressed at lower levels in the adult
brain.
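
The percent identity figures quoted in this example compare
aligned regions of two kinases. A minimal sketch of such a
calculation is given below. It assumes the two sequences have
already been aligned to equal length with "-" marking gaps, and it
counts only columns in which neither sequence has a gap; the
specification does not state which convention was used, so both
the convention and the function name are assumptions.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent of gap-free alignment columns holding identical residues."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    identical = 0
    for res_a, res_b in zip(aligned_a, aligned_b):
        if res_a == gap or res_b == gap:
            continue  # skip columns containing a gap in either sequence
        compared += 1
        if res_a == res_b:
            identical += 1
    if compared == 0:
        raise ValueError("no gap-free columns to compare")
    return 100.0 * identical / compared
```

For example, calling the function on the toy alignment
"ACDEFGHIKL" and "ACDQFGH-KL" (invented strings, not Cek
sequences) compares nine columns, of which eight are identical.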

The Cek6 amino acid sequence (SEQ ID NO: 2) is
most closely related to that of rat Elk (96% identity in
the catalytic and C-terminal regions). Of the Cek members
of the Eph subclass, Cek6 is most closely related to Cek5
(SEQ ID NO: 18) and Cek10 (SEQ ID NO: 10) (82% amino acid
identity with both, in the catalytic and C-terminal
regions) (Figure 1). The two Cek6 cDNAs that were isolated
from a 13-day chick embryo brain library were identical and
both encoded a protein with a deletion of 32 amino acids
and an insertion of 19 amino acids in the extracellular
region (Figure 1). However, these may be cloning
artifacts, particularly the deletion, since it causes a
shift in the reading frame and the premature termination of
the encoded protein. A 4.4 kb Cek6 transcript was found to
be expressed at high levels in the 10-day embryo and in
adult brain, lung, heart and skeletal muscle (Figure 2).
Low levels of Cek6 expression were detected in all other
adult tissues tested. A second larger Cek6 transcript of
about 6.5 kb was detected at low levels in the adult brain.



The amino acid sequence of Cek7 (SEQ ID NO: 4) is
71% identical to that of Cek5 (SEQ ID NO: 18) in the
catalytic and C-terminal regions and is most closely
related to those of Cek4 (SEQ ID NO: 16) and Cek9 (SEQ ID
NO: 8) (75% amino acid identity with both, in the same
regions) (Figure 1). A variant form of Cek7, containing a
22 amino acid insertion in the juxtamembrane domain (Figure
1), also was isolated and designated Cek7+. Cek7 (SEQ ID NO:
4) and Cek7+ (SEQ ID NO: 20) may originate from the same
gene by alternative splicing. A second variant form of
Cek7, designated Cek7' (SEQ ID NO: 22), which also
presumably originates via alternative splicing, differs
from Cek7 in the C-terminal 33 amino acids. Cek7 appears
to have the lowest levels of expression among all the
Eph-related kinases examined. Three different transcripts of
about 4.4 kb, 7 kb and 8.5 kb were detected in the 10-day
embryonic brain. Expression was weaker in the rest of the
10-day embryo, where only the 4.4 kb transcript could be
detected (Figure 2). Cek7 transcripts were not detected in
the adult tissues, except for a barely detectable 8.5 kb
transcript in the brain (Figure 2).

Cek8 (SEQ ID NO: 6) is equally related to Cek5
(SEQ ID NO: 18), Cek6 (SEQ ID NO: 2), Cek7 (SEQ ID NO: 4)
and Cek10 (SEQ ID NO: 10) (74% amino acid identity in the
catalytic and C-terminal regions) (Figure 1). A single 6
kb Cek8 transcript was found to be present in both the
10-day embryonic brain and body tissues (Figure 2). Cek8 (SEQ
ID NO: 6) expression appears to be the highest in adult
brain and retina and is also detectable in kidney, lung,
skeletal muscle and thymus (Figure 2; see, also, Example
II). Cek8 expression was not detected in heart and liver.

Cek9 (SEQ ID NO: 8) is most closely related to
Cek5 (SEQ ID NO: 18) (77% identity at the amino acid level
in the catalytic and C-terminal regions) (Figure 1). A 4.4
kb Cek9 transcript is present in embryonic brain and body
tissues. Two additional and very minor transcripts of
about 5.5 kb and 6.5 kb were detected exclusively in the
10-day embryonic brain (Figure 2). Among the adult tissues
examined, Cek9 expression is prominent in the thymus and
detectable in brain, retina, kidney, lung and heart. None
of the other kinases examined displays such an elevated
level of expression in the thymus. Cek9 expression was not
detected in skeletal muscle and liver.

Cek10 (SEQ ID NO: 10) is most closely related to
Cek5 (SEQ ID NO: 18) and Cek6 (SEQ ID NO: 2) (84% amino
acid identity with both in the catalytic and C-terminal
regions) (Figure 1). A variant form of Cek10, containing
a 15 amino acid insertion in the juxtamembrane domain
(Figure 1), was also isolated and designated Cek10+ (SEQ ID
NO: 14). Cek10 and Cek10+ may originate from the same gene
by alternative splicing. Northern blot analysis identified
two Cek10 transcripts of about 4.4 kb and 6 kb, present at
different relative levels in 10-day embryonic brain and
body tissues as well as in a number of adult tissues
(Figure 2). Among the adult tissues examined, Cek10
expression was particularly prominent in the kidney. Lower
Cek10 expression was detected in the lung and barely
detectable transcripts were also present in brain, liver,
heart, skeletal muscle and thymus.

A variant form of Cek5, containing a 16 amino
acid insertion in the juxtamembrane domain, was also
identified and termed Cek5+ (SEQ ID NO: 12) (Figure 1).
This Cek5 variant may originate as a result of alternative
splicing. With a Cek5 DNA probe recognizing both Cek5 and
Cek5+ (see Material and Methods), a 4.4 kb transcript was
detected in both 10-day embryonic brain and body tissues
(Figure 3, lanes 1 and 3). In addition, a much larger
transcript (of about 10 kb) was detected in the 10-day
embryonic brain (Figure 3, lane 3). Consistent with the
previously reported expression of the Cek5 protein, Cek5
transcripts are more abundant in the brain than in other
10-day embryonic tissues. Using a probe corresponding to
the 16 amino acid insertion in the juxtamembrane domain (Figure
3, lanes 2 and 4), Cek5+ was found to be exclusively
expressed in the CNS and only as the 4.4 kb transcript.
Because Cek5 immunoreactivity in the CNS has been
previously found to be confined to neurons, Cek5+ appears to
be a neuron-specific variant of Cek5.

Polyclonal antibodies specifically recognizing
Cek4, Cek8 and Cek9 have been obtained and will be used for
the characterization of these kinases (see Example II).
Peptides corresponding to the carboxy-terminal ends of
Cek4, Cek8 and Cek9 were coupled to bovine serum albumin
(BSA) with m-maleimidobenzoyl-N-hydroxysuccinimide ester
(Cek4) or with glutaraldehyde (Cek8 and Cek9) and used as
immunogens. The peptides used were the following: Cek4,
CLETHTKNSPVPV (SEQ ID NO: 24); Cek8, KMQQMHGRMVPV (SEQ ID NO:
25) and Cek9, KVHLNQLEPVEV (SEQ ID NO: 26). The
carboxy-terminal regions were chosen because they are poorly
conserved within the Eph subclass, increasing the
likelihood of obtaining antibodies specific for each
kinase.

The antibodies were purified from the antiserum
by affinity chromatography on the appropriate peptides
coupled to N-hydroxysuccinimide-activated agarose
(BioRad). As shown in Figure 4, after affinity
purification the antibodies to Cek4, Cek8 and Cek9
recognize a single band of the expected apparent molecular
mass (about 120 kiloDalton, kDa) in membrane-containing
fractions isolated from 10-day embryonic brain, but not in
fractions containing soluble proteins. These antibodies do
not cross-react significantly with related members of the
Eph subclass (not shown) and can be used for different
applications such as immunoblotting, immunofluorescence
microscopy and immunoprecipitation (see Figure 4). All of
the antibodies are capable of immunoprecipitating the
kinases from tissue extracts and, as expected, the
immunoprecipitated kinases undergo in vitro
autophosphorylation in the presence of ATP (see Example
II).

These techniques will allow the characterization
of the kinases of the Eph subclass at the protein level.
Coupled to a solid support, the antibodies can also be used
to purify the kinases from tissues and cell lines. In the
cases tested, antibodies generated to the chicken
Eph-related kinases recognize the corresponding mammalian
homologues. Thus, these antibodies could be used, for
example, to screen tumor samples for the presence of the
appropriate Eph-related kinases.

EXAMPLE II 

CHARACTERIZATION OF CEK8 

This example describes structural and functional
characteristics of the Cek8 protein (SEQ ID NO: 6),
including the expression and activity of Cek8 during
development and in tumor cells.

A. Antibody preparation: 

Cek8 expression and activity were examined using
immunological and immunohistochemical methods. An antigen
for raising anti-Cek8 antibodies was prepared by coupling
the peptide KMQQMHGRMVPV (SEQ ID NO: 25), which consists of
the eleven carboxy terminal amino acids of Cek8 preceded by
an additional N-terminal lysine, to BSA using
glutaraldehyde (Harlow and Lane, supra, 1988). An antigen
for raising anti-Cek4 antibodies was prepared by coupling
the peptide CLETHTKNSPVPV (SEQ ID NO: 24), which
corresponds to the 12 carboxy terminal amino acids of Cek4
preceded by an additional cysteine at the N-terminus, to BSA
using m-maleimidobenzoyl-N-hydroxysuccinimide ester
(Harlow and Lane, supra, 1988). Anti-Cek5 antibodies and
anti-phosphotyrosine antibodies were prepared as described
by Pasquale, supra, (1991). Antisera were raised in
rabbits using standard methods (see, for example, Harlow
and Lane, supra, 1988). The peptide antigen was coupled to
N-hydroxysuccinimide-activated agarose and specific
antisera were affinity purified.

B. Structural characterization of Cek8:

Cek8 was immunoprecipitated and examined by
immunoblotting as described in Section C.1., below. The
affinity purified anti-Cek8 antibodies recognized a protein
having an apparent molecular mass of about 120 kDa, which
was the expected size for Cek8. The calculated molecular
mass of Cek8, however, is less than the 120 kDa observed by
SDS-PAGE. Since Cek8 contains three consensus sites of
N-linked glycosylation, Cek8 was examined for such
glycosylation. When chicken embryo fibroblasts were grown
in the presence of 1.6 µg/ml tunicamycin, which inhibits
N-linked glycosylation, the apparent molecular mass of Cek8
decreased by about 10 kDa.
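
Consensus sites of N-linked glycosylation are conventionally
identified as the sequon Asn-X-Ser/Thr, where X is any residue
other than proline. A minimal sketch of such a scan is shown
below. The function name is hypothetical, the rule is the standard
sequence heuristic rather than a method described in the
specification, and a match indicates only a potential site, not
that the site is actually glycosylated.

```python
import re

# N-X-S/T with X != P; the lookahead lets overlapping sequons be reported.
SEQUON = re.compile(r"(?=N[^P][ST])")


def find_n_glyc_sequons(protein):
    """Return 1-based positions of Asn residues in N-X-[S/T] sequons."""
    return [m.start() + 1 for m in SEQUON.finditer(protein.upper())]
```

Applied to a one-letter amino acid sequence such as SEQ ID NO: 6,
a scan of this kind would list the candidate asparagines. If all
three sites in Cek8 were occupied, the shift of about 10 kDa seen
with tunicamycin would average a little over 3 kDa per site; that
arithmetic is illustrative only.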



In order to characterize the carbohydrate moiety
of Cek8, lectin affinity chromatography was performed. Ten
day embryonic chicken brains were sonicated in 10 ml PBS
containing protease inhibitors (1 mM
phenylmethylsulfonyl fluoride, 0.2 trypsin inhibitor
units aprotinin/ml, 10 µg/ml pepstatin and 10 µg/ml
leupeptin) and 1 mM sodium orthovanadate, a phosphatase
inhibitor. The sonicated material was centrifuged at 2000
x g for 5 min to remove insoluble material, then the
supernatant was centrifuged at 200,000 x g for 40 min.

The pellet, which contained the membrane enriched
fraction, was solubilized in PBS containing 0.1% Triton
X-100. The solubilized sample was centrifuged 5 min in a
microfuge and the supernatant was collected. The extract
was dialyzed overnight at 4°C against 10 mM Tris-HCl, pH
7.4, loaded onto various lectin columns, including
concanavalin A, lentil lectin, wheat germ agglutinin, ricin
I lectin, peanut lectin or Ulex europaeus I lectin (EY
Laboratories, Inc.; San Mateo CA), and the columns were
eluted with 0.1 M methyl α-D-mannopyranoside, 0.1 M
D-mannose, 0.1 M N-acetyl-D-glucosamine, 0.1 M α-lactose, 0.1
M α-lactose or 0.05 M α-L-fucose, respectively. Fractions
were collected and analyzed by immunoblotting for the
presence of Cek8 as described below.

Cek8 bound to the concanavalin A, lentil lectin,
ricin I and wheat germ agglutinin columns, and was eluted
with the appropriate buffers. These lectins preferentially
recognize N-linked sugar chains. Thus, this result is in
agreement with the observed inhibition of glycosylation by
tunicamycin. In contrast, Cek8 does not bind to peanut
lectin, which primarily recognizes O-linked chains.

Binding to concanavalin A and elution with the
relatively low concentration of 0.1 M methyl
α-D-mannopyranoside indicates that Cek8 contains biantennary
complex type sugar chains (Osawa and Tsuji, Ann. Rev.
Biochem. 56:21-42 (1987)). Binding to lentil lectin
indicates that a fucose residue is present on the innermost
N-acetylglucosamine residue in an oligosaccharide core.
However, since Cek8 does not bind with Ulex europaeus I
lectin, terminal fucose residues are not likely present
(Sugii and Kabat, Carb. Res. 99:99-101 (1982)). Binding of
Cek8 to wheat germ agglutinin indicates that sialic acid is
present and binding to the ricin I column indicates that
terminal β-galactosyl residues are present in complex sugar
chains. These carbohydrate structures likely are located
in the extracellular regions of Cek8 and can participate in
interactions with extracellular molecules. In vivo
phosphorylation on tyrosine of Cek8 can be achieved by
exposing cells expressing Cek8 to wheat germ agglutinin.

C. Expression and Catalytic Activity of Cek8:

This section describes the methods for
determining Cek8 expression and activity in various tissues
during development and in tumor cells.

1. Methods

Cek8 expression and activity were determined by
immunoprecipitation and immunoblot experiments. Cells from
90% confluent tissue culture plates were washed 3x with ice
cold phosphate buffered saline (PBS), collected in cold
RIPA buffer (150 mM sodium chloride, 10 mM sodium
phosphate, pH 7.2, 1% deoxycholate, 1% Triton X-100, 0.1%
SDS) containing protease inhibitors, and lysed by
sonication. Phosphotyrosine was added to a final
concentration of 8 mM when immunoblotting was performed
using anti-phosphotyrosine antibodies.

Tissues were removed from adult chickens or
chicken embryos and sonicated in PBS containing protease
inhibitors. Whole embryos were collected and sonicated in
PBS. Lysates were stored at -70°C. Protein
concentrations were determined using a Bio-Rad protein
assay (Bio-Rad Laboratories; Richmond CA). For
immunoprecipitations, tissue extracts were diluted in RIPA
buffer. Cell lysates and tissue extracts in RIPA buffer
were precleared using Staph A (Boehringer-Mannheim;
Indianapolis IN) as described by Pasquale (supra, 1991).
The samples then were incubated 40 min with 20 µg
anti-Cek4, anti-Cek5 or anti-Cek8 antibodies or 20 µg control
rabbit IgG preabsorbed to 20 µl Staph A. The amount of
antibody was selected to ensure that all of the antigen in
the extracts or lysates was precipitated.

Immunoprecipitated material was washed 3x with
RIPA buffer and 1x with PBS. Sample buffer was added, the
immunoprecipitates were boiled for 5 min, separated by
SDS-PAGE on 7.5% gels and transferred to nitrocellulose as
described by Towbin et al., Proc. Natl. Acad. Sci., USA
76:4350-4354 (1979), which is incorporated herein by
reference. Following transfer, the filters were incubated
overnight in Tris(hydroxymethyl)aminomethane-buffered saline
(TBS) containing 3% BSA, then incubated 4 hr in 3% BSA
containing 3 µg/ml anti-Cek4, anti-Cek5, anti-Cek8 or
anti-phosphotyrosine antibody. The filters were rinsed with
TBS, then incubated for 1 hr with 0.2 µg/ml protein A
peroxidase (Sigma; St. Louis MO) in TBS containing 3% BSA.
The filters were rinsed several times with TBS and
developed using enhanced chemiluminescence reagents
(Amersham; Arlington Heights IL). In some experiments,
after detection, the filters were dried for a few hours,
then incubated in 3% BSA in TBS and probed with a different
antibody.

In vitro phosphorylation was performed as
described by Pasquale, supra, (1991). Briefly, Cek8 was
immunoprecipitated from 10-day embryonic brain extracts or
from cell lysates. In control experiments, Cek5 was
immunoprecipitated. Immunoprecipitations were performed as
described above. The immune complexes were incubated for
30 min at 37°C in phosphorylation buffer (25 mM
N-2-hydroxyethylpiperazine-N'-2-ethanesulfonic acid, pH 7.5,
10 mM MgCl2, 10 mM MnCl2, 1 mM sodium orthovanadate, 0.1%
Triton X-100, 150 µM ATP). Sample buffer was added and
electrophoresis and transfer to nitrocellulose were
performed as described above. Following transfer, the
filters were incubated overnight in 3% BSA in TBS, then
incubated 4 hr in 3% BSA containing 3 µg/ml
anti-phosphotyrosine antibodies.
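
For bench use, the phosphorylation buffer described above can be
assembled from concentrated stocks using C1 x V1 = C2 x V2. In the
sketch below the final concentrations are those given in the text,
whereas the stock concentrations and all names are hypothetical
and chosen only for illustration. Triton X-100 (0.1%) is omitted
because it is specified as a percentage rather than a molarity.

```python
def stock_volume_ul(final_conc, stock_conc, final_volume_ul):
    """Solve C1 * V1 = C2 * V2 for V1 (same concentration units for both)."""
    if stock_conc <= 0 or final_volume_ul <= 0 or final_conc < 0:
        raise ValueError("concentrations and volume must be positive")
    if final_conc > stock_conc:
        raise ValueError("stock is more dilute than the target")
    return final_conc * final_volume_ul / stock_conc


# Final concentrations (mM) as stated in the text.
FINAL_MM = {
    "HEPES pH 7.5": 25.0,
    "MgCl2": 10.0,
    "MnCl2": 10.0,
    "sodium orthovanadate": 1.0,
    "ATP": 0.150,
}

# Stock concentrations (mM): hypothetical values, not from the text.
STOCK_MM = {
    "HEPES pH 7.5": 1000.0,
    "MgCl2": 1000.0,
    "MnCl2": 1000.0,
    "sodium orthovanadate": 100.0,
    "ATP": 10.0,
}


def recipe(final_volume_ul):
    """Microliters of each stock needed for the requested final volume."""
    return {
        name: stock_volume_ul(FINAL_MM[name], STOCK_MM[name], final_volume_ul)
        for name in FINAL_MM
    }
```

The remainder of the final volume is made up with water after
the detergent is added.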

2. Cek8 expression and activity during development

In whole embryo extracts, Cek8 expression was
detectable at embryonic day 3, increased gradually between
embryonic days 3 and 5, then remained relatively constant
through embryonic day 10, which was the last timepoint
examined. Cek8 was phosphorylated on tyrosine in vivo at
a low level in 10-day embryonic brain. In addition, Cek8
underwent autophosphorylation on tyrosine in vitro in the
presence of ATP and divalent metal ions.

Cek8 expression also was examined in various
tissues of 10-day chicken embryos. Cek8 was most abundant
in the brain and retina, was expressed at substantial
levels in thigh, gizzard and lungs and at lower levels in
intestine, liver, lens and heart. Cek8 was not detectable
in blood.

The developmental regulation of Cek8 expression
was examined in greater detail in cerebrum, cerebellum,
retina and thigh. In the cerebrum, Cek8 expression is low
at embryonic day 6, then gradually increases to a maximal
level at embryonic days 16 to 20. Cek8 expression is low,
but detectable, in adult cerebrum. In contrast, expression
in the cerebellum is low at embryonic day 12 and barely
detectable at later stages of development. In thigh
muscle, Cek8 expression is highest at embryonic day 7, then
decreases to barely detectable levels by day 13, before
terminal skeletal muscle differentiation occurs. In the
retina, Cek8 expression remains relatively constant from
embryonic day 8 until hatching.

Cek8 expression also was examined by
immunoperoxidase staining in chicken embryo frozen tissue
sections. Embryos were removed from eggs, fixed in 4%
formaldehyde, 0.1 mM sodium orthovanadate in PBS for 16 to
24 hr, then cryoprotected in 20% sucrose in PBS, 0.1 mM
sodium orthovanadate for 24 hr. Embryos were embedded in
OCT compound (Miles Inc.; Tarrytown NY), then frozen in dry
ice/2-methylbutane. Ten µm cryostat sections were
collected on glass slides and stored at -70°C.

The sections were treated with 0.3% hydrogen
peroxide for 10 min, then blocked with 3% BSA or normal
goat or horse serum in PBS for 30 min. Sections were
incubated with rabbit anti-Cek8 antibodies (10-20 µg/ml) or
mouse anti-200 kDa neurofilament protein antibodies (1
µg/ml; Boehringer Mannheim; Indianapolis IN) in a 1:50
dilution of normal goat serum or horse serum for 30 min.
Controls were performed using anti-Cek8 antibodies that
were preincubated with the antigen.



Following incubation with the primary antibody,
the sections were rinsed with PBS and incubated with
biotinylated goat anti-rabbit or horse anti-mouse IgG
(Vector Labs; Burlingame CA). After additional washes with
PBS, the sections were incubated with an
avidin-biotin-peroxidase complex or with an avidin-biotin-alkaline
phosphatase complex (Vector Labs). Following several
washes in PBS, peroxidase or alkaline phosphatase was
visualized using the appropriate substrate kit (Vector
Labs). The sections then were rinsed in PBS, air dried,
mounted in Permount and sealed with a #1 coverslip.
Specimens were photographed with a Zeiss 405M inverted
microscope.

Cek8 immunoreactivity was intense in the spinal
cord and the spinal nerves. Localization of Cek8 in the
spinal nerves was similar to that of a 200 kDa
neurofilament protein. At embryonic day 6, Cek8 expression
was restricted to the ventral portions of the spinal
nerves, which contain axons of motor neurons.

The results of these experiments indicate that 
Cek8 is expressed early in development. In general, Cek8 
expression is lower early in embryogenesis than at later
stages. Cek8 is differentially regulated in different
tissues during development and expression is highest in the
nervous system but also occurs in non-neuronal tissues. In
view of these results, aberrant Cek8 expression or
expression of an aberrant Cek8 protein can affect
development by causing defective signal transduction
throughout an organism.
3. Cek8 expression and activity in tumor cells

Protein tyrosine kinase activity is tightly 
regulated in normal tissues and, in the tissues described 
above, Cek8 was phosphorylated on tyrosine at a low level.
It is well known that uncontrolled tyrosine kinase activity
can lead to neoplastic transformation (Bishop, J.M., Cell
64:234-248 (1991)). Therefore, the expression and
activation of Cek8 in a number of tumor cell lines were
examined.

Because of the predominant expression of Cek8 in
the brain and retina, Cek8 expression and activity were
determined in a number of cell lines, B50, B49, B35, B28
and B23, which were derived from central nervous system
(CNS) tumors (Schubert et al., Nature 249:224-227 (1974),
which is incorporated herein by reference). B35 and B50
cells have neuronal properties and both expressed Cek8.
However, Cek8 is substantially phosphorylated on tyrosine
only in B50 cells (Figures 5.A. and 5.B.). B28 and B49
cells, which display glial characteristics, both expressed
a moderate level of Cek8 that is phosphorylated on
tyrosine. B23 cells did not have detectable levels of Cek8.

The highest level of Cek8 expression was found in
undifferentiated P19 embryonal carcinoma cells (McBurney
and Rogers, J. Devel. Biol. 89:503-508 (1982), which is
incorporated herein by reference) and in HaCaT
keratinocytes (Boukamp et al., J. Cell Biol. 106:761-770
(1988), which is incorporated herein by reference). In
both of these cell lines, Cek8 was phosphorylated on
tyrosine. Furthermore, comparable levels of Cek8
expression were observed in normal and Rous sarcoma
virus-transformed chicken embryo fibroblasts. However, Cek8
was substantially phosphorylated on tyrosine only in the
transformed cells (Figures 6.C. and 6.D.). In addition, in
LMH cells, which were derived from a hepatocellular
carcinoma (Kawaguchi et al., Canc. Res. 47:4460-4464
(1987), which is incorporated herein by reference), Cek8 is
highly phosphorylated on tyrosine as compared to adult or
embryonic liver (Figures 6.A. and 6.B.).

For comparison, Cek5 expression and activation 
also was examined in the CNS tumor-derived cell lines. 
Cek5 was immunoprecipitated using anti-Cek5 antibodies 
followed by immunoblotting with anti-phosphotyrosine
antibodies. Cek5 was expressed in all of the cell lines
derived from tumors of the CNS, with the highest expression
in the B35 cells and the B49 cells. Tyrosine
phosphorylation of Cek5 was observed in B28, B49 and B50
cells (Figures 5.C. and 5.D.). Cek5 also was highly
expressed and phosphorylated in P19 cells and HaCaT cells.
Thus, Cek5 expression and activation are similar, but not
identical, to Cek8 expression and activation in tumor
cells.

4. Effect of tyrosine phosphorylation on Cek8 kinase activity

The effect of tyrosine phosphorylation on the in
vitro catalytic activity of Cek8 also was examined. In
vivo substrates of Cek8 have not yet been identified.
Therefore, a fusion protein consisting of the C-terminal
117 amino acids of Cek4 fused to β-galactosidase was used
as an exogenous substrate. The fusion protein was purified
from bacterial extracts by SDS-PAGE and eluted from the
gel.

Assays were performed by incubating 1 µg fusion
protein substrate with Cek8 immunoprecipitate. Cek8 was
immunoprecipitated from 10-day chick embryonic brain using
10 µg anti-Cek8 antibodies and 5 µl Staph A and was
complexed to the antibodies and Staph A when the substrate
was added. In some cases, Cek8 was phosphorylated for 1 hr
using the in vitro kinase reaction described above, in
order to obtain Cek8 in a highly tyrosine-phosphorylated
form. The fusion protein substrate then was added and
phosphorylation of the substrate was allowed to proceed for
1 min at 0 °C or 37 °C in the phosphorylation buffer
described above containing 200 µM ATP. In parallel
experiments, Cek8 that was not phosphorylated in vitro was
used in the assay.

Following incubation, the samples were
centrifuged briefly in a microfuge, 100 µl 1% SDS in PBS
was added to the pellets and the samples were heated at 95
°C for 5 min. The samples were centrifuged for 3 min and
the supernatants were transferred to tubes containing
anti-β-galactosidase antibodies bound to Staph A beads in
900 µl RIPA buffer lacking SDS. Immunoprecipitation was
performed as described above and the extent of tyrosine
phosphorylation of the immunoprecipitated fusion protein
substrate was analyzed by immunoblotting using
anti-phosphotyrosine antibodies.
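The SDS concentration during this second immunoprecipitation follows from the volumes given above. The short Python sketch below is only a rough check: it assumes that the whole 100 µl SDS supernatant is transferred and that the residual pellet volume is negligible, neither of which is stated in the text. The helper name `final_concentration` is illustrative.

```python
def final_concentration(stock_percent, stock_volume_ul, diluent_volume_ul):
    """Concentration after mixing a stock with a diluent that lacks the solute."""
    total_volume_ul = stock_volume_ul + diluent_volume_ul
    return stock_percent * stock_volume_ul / total_volume_ul

# Assumed volumes: 100 µl of 1% SDS eluate added to 900 µl RIPA buffer lacking SDS.
sds_final = final_concentration(1.0, 100.0, 900.0)
print(f"Final SDS: {sds_final:.2f}%")
```

Under these assumptions the SDS is diluted tenfold, to about 0.1%, before the anti-β-galactosidase immunoprecipitation.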

As shown in Figure 6.E., the phosphorylated form
of Cek8 produced a greater amount of phosphorylation of the
substrate on tyrosine at both 0 °C and 37 °C. These results
indicate that activation of Cek8 by tyrosine
phosphorylation increases the kinase activity of Cek8.

Although the invention has been described with
reference to the disclosed embodiments, it should be
understood that various modifications can be made without
departing from the spirit of the invention. Accordingly,
the invention is limited only by the following claims.
CA GAA ACC CTG ATG GAC ACA CGG ACA GCG ACG GCT GAG CTG GGC TGG 47
Glu Thr Leu Met Asp Thr Arg Thr Ala Thr Ala Glu Leu Gly Trp
1 5 10 15



ACT GCC AAC CCT CCG TCA GGG TGG GAA GAA GTG AGT GGC TAC GAC GAG 95
Thr Ala Asn Pro Pro Ser Gly Trp Glu Glu Val Ser Gly Tyr Asp Glu 
20 25 30 
AAC CTG AAC ACC ATC CGT ACC TAC CAG GTG TGC AAC GTC TTC GAG CCA 143
Asn Leu Asn Thr Ile Arg Thr Tyr Gln Val Cys Asn Val Phe Glu Pro
35 40 45 

AAC CAG AAC AAC TGG CTC CTC ACC ACC TTC ATC AAC CGG CGC GGA GCC 191 
Asn Gln Asn Asn Trp Leu Leu Thr Thr Phe Ile Asn Arg Arg Gly Ala
50 55 60 

CAC CGC ATC TAC ACT GAG ATG CGC TTC ACT GTG CGG GAC TGC AGC AGC 239 
His Arg Ile Tyr Thr Glu Met Arg Phe Thr Val Arg Asp Cys Ser Ser
65 70 75 

CTC CCC AAC GTC CCC GGC TCC TGC AAG GAG ACC TTC AAC CTC TAC TAC 287 
Leu Pro Asn Val Pro Gly Ser Cys Lys Glu Thr Phe Asn Leu Tyr Tyr 
80 85 90 95 

TAT GAG ACA GAC TCT GTC ATT GCC ACT AAG AAG TCG GCC TTC TGG ACG 335 
Tyr Glu Thr Asp Ser Val Ile Ala Thr Lys Lys Ser Ala Phe Trp Thr
100 105 110 

GAG GCA CCC TAC CTC AAA GTG GAC ACC ATT GCT GCT GAC GAG AGC TTT 383 
Glu Ala Pro Tyr Leu Lys Val Asp Thr Ile Ala Ala Asp Glu Ser Phe
115 120 125

TCC CAG GTG GAC TTT GGT GGC AGG TTG ATG AAG GGT T TTC TTC AAG 429 
Ser Gln Val Asp Phe Gly Gly Arg Leu Met Lys Gly Phe Phe Lys
130 135 140 

AAG TGC CCA AGC GTG GTG CAG AAC TTC GCT ATC TTC CCT GAG ACG ATG 477 
Lys Cys Pro Ser Val Val Gln Asn Phe Ala Ile Phe Pro Glu Thr Met
145 150 155 

ACG GGG GCA GAG AGC ACC TCT CTG GTG ACA GCA CGG GGC ACC TGC ATC 525
Thr Gly Ala Glu Ser Thr Ser Leu Val Thr Ala Arg Gly Thr Cys Ile
160 165 170 

CCC AAC GCT GAG GAG GTG GAC GTG CCC ATC AAG CTG TAC TGC AAC GGG 573 
Pro Asn Ala Glu Glu Val Asp Val Pro Ile Lys Leu Tyr Cys Asn Gly
175 180 185 190 

GAT GGG GAG TGG ATG GTA CCC ATA GGT CGC TGC ACC TGC AAG GCT GGT 621 
Asp Gly Glu Trp Met Val Pro Ile Gly Arg Cys Thr Cys Lys Ala Gly
195 200 205 

TAT GAG CCG GAA AAC AAC GTG GCT TGC AGA GCC TGC CCG GCT GGG ACA 669 
Tyr Glu Pro Glu Asn Asn Val Ala Cys Arg Ala Cys Pro Ala Gly Thr 
210 215 220 

TTC AAA GCC AGT CAG GGT GCG GGG CTG TGT GCC CGC TGT CCC CCC AAC 717 
Phe Lys Ala Ser Gln Gly Ala Gly Leu Cys Ala Arg Cys Pro Pro Asn
225 230 235 

AGC CGC TCC AGC GCC GAG GCC TCA CCG CTC TGC GCC TGC CGC AAC GGC 765 
Ser Arg Ser Ser Ala Glu Ala Ser Pro Leu Cys Ala Cys Arg Asn Gly 
240 245 250 

TAC TTT CGG GCT GAC CTG GAC CCA CCG ACA GCT GCC TGC ACC AGC GTC 813 
Tyr Phe Arg Ala Asp Leu Asp Pro Pro Thr Ala Ala Cys Thr Ser Val 
255 260 265 270

CCC TCT GGT CCA CGC AAC GTC ATC TCC ATT GTC AAT GAG ACC TCC ATC 861 
Pro Ser Gly Pro Arg Asn Val Ile Ser Ile Val Asn Glu Thr Ser Ile
275 280 285 

ATC CTG GAG TGG AAC CCG CCA CGG GAG ACA GGA GGC CGG GAT GAT GTC 909 
Ile Leu Glu Trp Asn Pro Pro Arg Glu Thr Gly Gly Arg Asp Asp Val
290 295 300 
ACT TAC AAC ATT GTC TGC AAG AAG TGC CGG GCA GAC CGG CGT GCC TGC 957
Thr Tyr Asn Ile Val Cys Lys Lys Cys Arg Ala Asp Arg Arg Ala Cys
305 310 315 

TCC CGC TGC GAC GAC AAC GTG GAG TTT GTG CCC CGA CAG CTG GGG CTG 1005 
Ser Arg Cys Asp Asp Asn Val Glu Phe Val Pro Arg Gln Leu Gly Leu
320 325 330 

ACA GAG ACC CGC GTC TTC ATC AGC AGC CTC TGG GCA CAC ACA CCC TAC 1053 
Thr Glu Thr Arg Val Phe Ile Ser Ser Leu Trp Ala His Thr Pro Tyr
335 340 345 350 

ACC TTT GAG ATC CAG GCG GTC AAC GGG GTT TCC AAC AAG AGC CCC TTC 1101 
Thr Phe Glu Ile Gln Ala Val Asn Gly Val Ser Asn Lys Ser Pro Phe
355 360 365 

CCA CCC CAG CAC GTC TCC GTG AAC ATC ACC ACC AAC CAA GCT GCA CCC 1149 
Pro Pro Gln His Val Ser Val Asn Ile Thr Thr Asn Gln Ala Ala Pro
370 375 380 

TCC ACT GTC CCC ATC ATG CAC CAG GTG AGT GCC ACC ATG AGG AGC ATC 1197 
Ser Thr Val Pro Ile Met His Gln Val Ser Ala Thr Met Arg Ser Ile
385 390 395 

ACG CTA TCC TGG CCG CAG CCG GAG CAG CCC AAC GGC ATC ATC CTG GAC 1245 
Thr Leu Ser Trp Pro Gln Pro Glu Gln Pro Asn Gly Ile Ile Leu Asp
400 405 410 

TAC GAG CTG CGC TAC TAC GAG AAG CTG AGC CGC ATC TGC ACG CCC GAT 1293 
Tyr Glu Leu Arg Tyr Tyr Glu Lys Leu Ser Arg Ile Cys Thr Pro Asp
415 420 425 430 

GTC AGC GGC ACT GTG GGC TCG AGA CCG GCG GCG GAC CAC AAC GAG TAC 1341 
Val Ser Gly Thr Val Gly Ser Arg Pro Ala Ala Asp His Asn Glu Tyr 
435 440 445 

AAC TCC TCT GTG GCC CGC AGT CAG ACC AAC ACG GCC CGG CTG GAG GGG 1389 
Asn Ser Ser Val Ala Arg Ser Gln Thr Asn Thr Ala Arg Leu Glu Gly
450 455 460 

CTG CGC CCT GGC ATG GTG TAC GTG GTG CAG GTG CGA GCA AGG ACG GTG 1437 
Leu Arg Pro Gly Met Val Tyr Val Val Gln Val Arg Ala Arg Thr Val
465 470 475 

GCC GGC TAT GGG AAG TAC AGT GGG AAG ATG TGC TTC CAG ACA CTG ACC 1485 
Ala Gly Tyr Gly Lys Tyr Ser Gly Lys Met Cys Phe Gln Thr Leu Thr
480 485 490 

GAT GAT GAC TAC AAG TCT GAG CTG AGG GAG CAG CTG CCA TTG ATT GCG 1533 
Asp Asp Asp Tyr Lys Ser Glu Leu Arg Glu Gln Leu Pro Leu Ile Ala
495 500 505 510 

GGG TCT GCA GCG GCC GGC GTG GTC TTC ATT GTT TCG CTG GTG GCC ATT 1581
Gly Ser Ala Ala Ala Gly Val Val Phe Ile Val Ser Leu Val Ala Ile
515 520 525 

TCC ATA GTG TGC AGC AGG AAG CGA GCG TAC AGC AAG GAG GTC GTT TAC 1629 
Ser Ile Val Cys Ser Arg Lys Arg Ala Tyr Ser Lys Glu Val Val Tyr
530 535 540 

AGC GAT AAG CTG CAG CAC TAC AGC ACC GGG AGA GGG TCT CCG GGA ATG 1677 
Ser Asp Lys Leu Gln His Tyr Ser Thr Gly Arg Gly Ser Pro Gly Met
545 550 555 

AAG ATT TAC ATC GAC CCC TTC ACT TAT GAG GAC CCC AAC GAG GCA GTG 1725 
Lys Ile Tyr Ile Asp Pro Phe Thr Tyr Glu Asp Pro Asn Glu Ala Val
560 565 570 
CGT GAG TTC GCC AAG GAG ATT GAC GTC TCC TTT GTG AAG ATT GAA GAG 1773
Arg Glu Phe Ala Lys Glu Ile Asp Val Ser Phe Val Lys Ile Glu Glu
575 580 585 590 

GTC ATT GGA GCA GGG GAG TTT GGA GAG GTG TAC AAA GGC CGC CTG AAG 1821 
Val Ile Gly Ala Gly Glu Phe Gly Glu Val Tyr Lys Gly Arg Leu Lys
595 600 605 

TTG CCT GGC AAG CGG GAG ATC TAT GTG GCC ATC AAA ACA CTG AAG GCT 1869 
Leu Pro Gly Lys Arg Glu Ile Tyr Val Ala Ile Lys Thr Leu Lys Ala
610 615 620 


GGC TAC TCA GAG AAG CAG CGC CGG GAT TTC CTG AGC GAA GCC AGC ATC 1917 
Gly Tyr Ser Glu Lys Gln Arg Arg Asp Phe Leu Ser Glu Ala Ser Ile
625 630 635 

ATG GGG CAG TTT GAC CAC CCC AAC ATC ATC CGG CTG GAA GGG GTG GTG 1965 
Met Gly Gln Phe Asp His Pro Asn Ile Ile Arg Leu Glu Gly Val Val
640 645 650 

ACC AAG AGC CGA CCA GTC ATG ATT ATC ACA GAG TTC ATG GAG AAT GGG 2013 
Thr Lys Ser Arg Pro Val Met Ile Ile Thr Glu Phe Met Glu Asn Gly
655 660 665 670 

GCC CTG GAC TCG TTC CTG CGG CAA AAT GAT GGG CAG TTC ACA GTG ATC 2061 
Ala Leu Asp Ser Phe Leu Arg Gln Asn Asp Gly Gln Phe Thr Val Ile
675 680 685 

CAG CTG GTG GGG ATG CTC AGA GGG ATT GCT GCT GGG ATG AAG TAC CTG 2109 
Gln Leu Val Gly Met Leu Arg Gly Ile Ala Ala Gly Met Lys Tyr Leu
690 695 700 

GCA GAG ATG AAC TAT GTC CAC AGG GAT CTG GCG GCC AGG AAC ATT CTG 2157 
Ala Glu Met Asn Tyr Val His Arg Asp Leu Ala Ala Arg Asn Ile Leu
705 710 715 

GTC AAC AGC AAC CTG GTG TGC AAA GTG TCA GAC TTT GGC CTC TCG CGC 2205 
Val Asn Ser Asn Leu Val Cys Lys Val Ser Asp Phe Gly Leu Ser Arg 
720 725 730 

TAC CTG CAG GAC GAC ACC TCT GAT CCC ACC TAC ACC AGC TCC TTG GGT 2253 
Tyr Leu Gln Asp Asp Thr Ser Asp Pro Thr Tyr Thr Ser Ser Leu Gly
735 740 745 750 

GGG AAG ATC CCT GTG CGA TGG ACA GCA CCA GAG GCC ATT GCG TAC CGC 2301 
Gly Lys Ile Pro Val Arg Trp Thr Ala Pro Glu Ala Ile Ala Tyr Arg
755 760 765 

AAG TTC ACG TCA GCC AGT GAC GTC TGG AGC TAT GGC ATC GTC ATG TGG 2349 
Lys Phe Thr Ser Ala Ser Asp Val Trp Ser Tyr Gly Ile Val Met Trp
770 775 780 

GAG GTG ATG TCG TTC GGA GAG AGG CCC TAC TGG GAC ATG TCC AAC CAG 2397 
Glu Val Met Ser Phe Gly Glu Arg Pro Tyr Trp Asp Met Ser Asn Gln
785 790 795 

GAC GTC ATC AAT GCC ATC GAG CAG GAC TAC CGG CTC CCG CCG CCC ATG 2445 
Asp Val Ile Asn Ala Ile Glu Gln Asp Tyr Arg Leu Pro Pro Pro Met
800 805 810 

GAC TGC CCA GCT GCC CTG CAC CAA CTG ATG CTG GAC TGC TGG CAG AAG 2493 
Asp Cys Pro Ala Ala Leu His Gln Leu Met Leu Asp Cys Trp Gln Lys
815 820 825 830

GAC CGC AAC ACC CGG CCT CGC TTG GCC GAG ATT GTC AAC ACC CTG GAC 2541 
Asp Arg Asn Thr Arg Pro Arg Leu Ala Glu Ile Val Asn Thr Leu Asp
835 840 845 
AAA ATG ATC CGC AAC CCG GCA AGC CTC AAA ACT GTG GCT ACC ATC ACC 2589
Lys Met Ile Arg Asn Pro Ala Ser Leu Lys Thr Val Ala Thr Ile Thr
850 855 860 

GCT GTG CCT TCT CAG CCC CTC CTC GAC CGC TCT ATC CCT GAT TTC ACT 2637 
Ala Val Pro Ser Gln Pro Leu Leu Asp Arg Ser Ile Pro Asp Phe Thr
865 870 875 

GCC TTT ACC TCA GTA GAA GAC TGG CTG AGT GCC GTC AAG ATG AGC CAG 2685 
Ala Phe Thr Ser Val Glu Asp Trp Leu Ser Ala Val Lys Met Ser Gln
880 885 890 

TAT AGA GAC AAC TTC CTG AGC GCT GGA TTC ACC TCC CTC CAG CTG GTC 2733 
Tyr Arg Asp Asn Phe Leu Ser Ala Gly Phe Thr Ser Leu Gln Leu Val
895 900 905 910 

GCC CAG ATG ACA TCT GAA GAC CTC CTG AGA ATA GGA GTA ACG CTG GCT 2781 
Ala Gln Met Thr Ser Glu Asp Leu Leu Arg Ile Gly Val Thr Leu Ala
915 920 925 

GGG CAC CAG AAG AAG ATC CTG AAC AGC ATC CAG TCC ATG CGC GTG CAG 2829 
Gly His Gln Lys Lys Ile Leu Asn Ser Ile Gln Ser Met Arg Val Gln
930 935 940 

ATG AGT CAG TCT CCG ACC TCG ATG GCGTGACGTC CCTCGCTCGA CGAGGAGGGG 2883 
Met Ser Gln Ser Pro Thr Ser Met Ala
945 950 

GACGGGGAGG GCAGGTGGCA GAGGTGGGAG GGGAGGAACT GATCTGATGG GAGCCGTGGG 2943 

GCCGCAGCTG GAGAGGGGCA GCCACGGCCG GGGCTGTGCC TGACCGCGGA GGACGTTCCT 3003 

GGGACTCGCC TCGGCCTGGT GACTTCCATC CCTCACCAAC AGAAGCACAC TTACCGATGT 3063 

CACGGGGGAC AGCGTATAAA TAAGTATAAA TATGTACAAA TCATATATTT AAAAAAAAAA 3123 

AAAAAAAAAG 3133

(2) INFORMATION FOR SEQ ID NO: 2: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 951 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2: 

Glu Thr Leu Met Asp Thr Arg Thr Ala Thr Ala Glu Leu Gly Trp Thr 
1 5 10 15

Ala Asn Pro Pro Ser Gly Trp Glu Glu Val Ser Gly Tyr Asp Glu Asn
20 25 30

Leu Asn Thr Ile Arg Thr Tyr Gln Val Cys Asn Val Phe Glu Pro Asn
35 40 45

Gln Asn Asn Trp Leu Leu Thr Thr Phe Ile Asn Arg Arg Gly Ala His
50 55 60

Arg Ile Tyr Thr Glu Met Arg Phe Thr Val Arg Asp Cys Ser Ser Leu
65 70 75 80 

Pro Asn Val Pro Gly Ser Cys Lys Glu Thr Phe Asn Leu Tyr Tyr Tyr 
85 90 95 
Glu Thr Asp Ser Val Ile Ala Thr Lys Lys Ser Ala Phe Trp Thr Glu
100 105 110

Ala Pro Tyr Leu Lys Val Asp Thr Ile Ala Ala Asp Glu Ser Phe Ser
115 120 125

Gln Val Asp Phe Gly Gly Arg Leu Met Lys Gly Phe Phe Lys Lys Cys
130 135 140

Pro Ser Val Val Gln Asn Phe Ala Ile Phe Pro Glu Thr Met Thr Gly
145 150 155 160

Ala Glu Ser Thr Ser Leu Val Thr Ala Arg Gly Thr Cys Ile Pro Asn
165 170 175

Ala Glu Glu Val Asp Val Pro Ile Lys Leu Tyr Cys Asn Gly Asp Gly
180 185 190

Glu Trp Met Val Pro Ile Gly Arg Cys Thr Cys Lys Ala Gly Tyr Glu
195 200 205

Pro Glu Asn Asn Val Ala Cys Arg Ala Cys Pro Ala Gly Thr Phe Lys 
210 215 220 

Ala Ser Gln Gly Ala Gly Leu Cys Ala Arg Cys Pro Pro Asn Ser Arg
225 230 235 240 

Ser Ser Ala Glu Ala Ser Pro Leu Cys Ala Cys Arg Asn Gly Tyr Phe 
245 250 255 

Arg Ala Asp Leu Asp Pro Pro Thr Ala Ala Cys Thr Ser Val Pro Ser 
260 265 270 

Gly Pro Arg Asn Val Ile Ser Ile Val Asn Glu Thr Ser Ile Ile Leu
275 280 285 

Glu Trp Asn Pro Pro Arg Glu Thr Gly Gly Arg Asp Asp Val Thr Tyr 
290 295 300 

Asn Ile Val Cys Lys Lys Cys Arg Ala Asp Arg Arg Ala Cys Ser Arg
305 310 315 320

Cys Asp Asp Asn Val Glu Phe Val Pro Arg Gln Leu Gly Leu Thr Glu
325 330 335

Thr Arg Val Phe Ile Ser Ser Leu Trp Ala His Thr Pro Tyr Thr Phe
340 345 350

Glu Ile Gln Ala Val Asn Gly Val Ser Asn Lys Ser Pro Phe Pro Pro
355 360 365

Gln His Val Ser Val Asn Ile Thr Thr Asn Gln Ala Ala Pro Ser Thr
370 375 380

Val Pro Ile Met His Gln Val Ser Ala Thr Met Arg Ser Ile Thr Leu
385 390 395 400

Ser Trp Pro Gln Pro Glu Gln Pro Asn Gly Ile Ile Leu Asp Tyr Glu
405 410 415

Leu Arg Tyr Tyr Glu Lys Leu Ser Arg Ile Cys Thr Pro Asp Val Ser
420 425 430

Gly Thr Val Gly Ser Arg Pro Ala Ala Asp His Asn Glu Tyr Asn Ser
435 440 445
Ser Val Ala Arg Ser Gln Thr Asn Thr Ala Arg Leu Glu Gly Leu Arg
450 455 460

Pro Gly Met Val Tyr Val Val Gln Val Arg Ala Arg Thr Val Ala Gly
465 470 475 480

Tyr Gly Lys Tyr Ser Gly Lys Met Cys Phe Gln Thr Leu Thr Asp Asp
485 490 495

Asp Tyr Lys Ser Glu Leu Arg Glu Gln Leu Pro Leu Ile Ala Gly Ser
500 505 510

Ala Ala Ala Gly Val Val Phe Ile Val Ser Leu Val Ala Ile Ser Ile
515 520 525

Val Cys Ser Arg Lys Arg Ala Tyr Ser Lys Glu Val Val Tyr Ser Asp
530 535 540

Lys Leu Gln His Tyr Ser Thr Gly Arg Gly Ser Pro Gly Met Lys Ile
545 550 555 560

Tyr Ile Asp Pro Phe Thr Tyr Glu Asp Pro Asn Glu Ala Val Arg Glu
565 570 575

Phe Ala Lys Glu Ile Asp Val Ser Phe Val Lys Ile Glu Glu Val Ile
580 585 590 

Gly Ala Gly Glu Phe Gly Glu Val Tyr Lys Gly Arg Leu Lys Leu Pro 
595 600 605 

Gly Lys Arg Glu Ile Tyr Val Ala Ile Lys Thr Leu Lys Ala Gly Tyr
610 615 620

Ser Glu Lys Gln Arg Arg Asp Phe Leu Ser Glu Ala Ser Ile Met Gly
625 630 635 640

Gln Phe Asp His Pro Asn Ile Ile Arg Leu Glu Gly Val Val Thr Lys
645 650 655

Ser Arg Pro Val Met Ile Ile Thr Glu Phe Met Glu Asn Gly Ala Leu
660 665 670

Asp Ser Phe Leu Arg Gln Asn Asp Gly Gln Phe Thr Val Ile Gln Leu
675 680 685

Val Gly Met Leu Arg Gly Ile Ala Ala Gly Met Lys Tyr Leu Ala Glu
690 695 700

Met Asn Tyr Val His Arg Asp Leu Ala Ala Arg Asn Ile Leu Val Asn
705 710 715 720

Ser Asn Leu Val Cys Lys Val Ser Asp Phe Gly Leu Ser Arg Tyr Leu
725 730 735

Gln Asp Asp Thr Ser Asp Pro Thr Tyr Thr Ser Ser Leu Gly Gly Lys
740 745 750

Ile Pro Val Arg Trp Thr Ala Pro Glu Ala Ile Ala Tyr Arg Lys Phe
755 760 765

Thr Ser Ala Ser Asp Val Trp Ser Tyr Gly Ile Val Met Trp Glu Val
770 775 780

Met Ser Phe Gly Glu Arg Pro Tyr Trp Asp Met Ser Asn Gln Asp Val
785 790 795 800
Ile Asn Ala Ile Glu Gln Asp Tyr Arg Leu Pro Pro Pro Met Asp Cys
805 810 815

Pro Ala Ala Leu His Gln Leu Met Leu Asp Cys Trp Gln Lys Asp Arg
820 825 830

Asn Thr Arg Pro Arg Leu Ala Glu Ile Val Asn Thr Leu Asp Lys Met
835 840 845

Ile Arg Asn Pro Ala Ser Leu Lys Thr Val Ala Thr Ile Thr Ala Val
850 855 860

Pro Ser Gln Pro Leu Leu Asp Arg Ser Ile Pro Asp Phe Thr Ala Phe
865 870 875 880

Thr Ser Val Glu Asp Trp Leu Ser Ala Val Lys Met Ser Gln Tyr Arg
885 890 895

Asp Asn Phe Leu Ser Ala Gly Phe Thr Ser Leu Gln Leu Val Ala Gln
900 905 910

Met Thr Ser Glu Asp Leu Leu Arg Ile Gly Val Thr Leu Ala Gly His
915 920 925

Gln Lys Lys Ile Leu Asn Ser Ile Gln Ser Met Arg Val Gln Met Ser
930 935 940

Gln Ser Pro Thr Ser Met Ala
945 950 

(2) INFORMATION FOR SEQ ID NO: 3:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 3059 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: both 

(D) TOPOLOGY: linear 

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 2..2167

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3:

C CTC AAA TTC ACC CTG AGG GAC TGT AAC AGC CTT CCA GGA GGA CTT 46 
Leu Lys Phe Thr Leu Arg Asp Cys Asn Ser Leu Pro Gly Gly Leu 
1 5 10 15 

GGG ACT TGC AAG GAG ACT TTT AAC ATG TAC TAC TTT GAG TCA GAT GAT 94 
Gly Thr Cys Lys Glu Thr Phe Asn Met Tyr Tyr Phe Glu Ser Asp Asp 
20 25 30 

GAA GAT GGG AGG AAC ATC AGA GAG AAT CAG TAC ATC AAG ATA GAT ACC 142 
Glu Asp Gly Arg Asn Ile Arg Glu Asn Gln Tyr Ile Lys Ile Asp Thr
35 40 45 

ATT GCT GCT GAT GAG AGC TTC ACG GAG TTG GAC CTC GGC GAC AGA GTT 190 
Ile Ala Ala Asp Glu Ser Phe Thr Glu Leu Asp Leu Gly Asp Arg Val
50 55 60 



ATG AAG TTA AAC ACA GAA GTG AGA GAT GTT GGG CCT CTA ACA AAA AAA 238
Met Lys Leu Asn Thr Glu Val Arg Asp Val Gly Pro Leu Thr Lys Lys
65 70 75
GGA TTT TAC CTT GCT TTC CAG GAT GTG GGC GCC TGC ATT GCC CTG GTC 286
Gly Phe Tyr Leu Ala Phe Gln Asp Val Gly Ala Cys Ile Ala Leu Val
80 85 90 95 

TCT GTG CGT GTG TAC TAC AAG AAA TGC CCA TCA GTG ATC CGC AAC CTG 334 
Ser Val Arg Val Tyr Tyr Lys Lys Cys Pro Ser Val Ile Arg Asn Leu
100 105 110 

GCA CGC TTT CCA GAT ACC ATC ACA GGA GCA GAT TCC TCG CAG CTG CTA 382 
Ala Arg Phe Pro Asp Thr Ile Thr Gly Ala Asp Ser Ser Gln Leu Leu
115 120 125 

GAA GTG TCA GGC GTC TGT GTC AAC CAC TCA GTG ACT GAT GAG GCA CCA 430 
Glu Val Ser Gly Val Cys Val Asn His Ser Val Thr Asp Glu Ala Pro 
130 135 140 

AAG ATG CAC TGC AGT TCA GAG GGA GAA TGG CTG GTG CCC ATT GGG AAG 478 
Lys Met His Cys Ser Ser Glu Gly Glu Trp Leu Val Pro Ile Gly Lys
145 150 155

TGT TTG TGC AAG GCA GGG TAC GAG GAG AAG AAC AAC ACC TGC CAA GCA 526 
Cys Leu Cys Lys Ala Gly Tyr Glu Glu Lys Asn Asn Thr Cys Gln Ala
160 165 170 175 

CCT TCT CCA GTC AGT AGT GTG AAA AAA GGG AAG ATA ACT AAA AAT AGC 574 
Pro Ser Pro Val Ser Ser Val Lys Lys Gly Lys Ile Thr Lys Asn Ser
180 185 190 

ATC TCC CTT TCC TGG CAG GAG CCA GAT CGA CCC AAC GGC ATC ATC CTG 622 
Ile Ser Leu Ser Trp Gln Glu Pro Asp Arg Pro Asn Gly Ile Ile Leu
195 200 205 

GAA TAC GAA ATC AAA TAT TTT GAA AAG GAC CAG GAG ACA AGC TAC ACC 670 
Glu Tyr Glu Ile Lys Tyr Phe Glu Lys Asp Gln Glu Thr Ser Tyr Thr
210 215 220 

ATC ATC AAA TCC AAA GAG ACC GCA ATT ACG GCA GAT GGC TTG AAA CCA 718
Ile Ile Lys Ser Lys Glu Thr Ala Ile Thr Ala Asp Gly Leu Lys Pro
225 230 235 

GGC TCA GCG TAC GTC TTC CAG ATC CGA GCC CGG ACA GCT GCT GGC TAC 766 
Gly Ser Ala Tyr Val Phe Gln Ile Arg Ala Arg Thr Ala Ala Gly Tyr
240 245 250 255 

GGT GGC TTC AGT CGA AGA TTT GAG TTT GAA ACC AGC CCA GTG TTA GCT 814 
Gly Gly Phe Ser Arg Arg Phe Glu Phe Glu Thr Ser Pro Val Leu Ala 
260 265 270 

GCA TCC AGT GAC CAG AGC CAG ATT CCT ATA ATT GTT GTG TCT GTA ACA 862 
Ala Ser Ser Asp Gln Ser Gln Ile Pro Ile Ile Val Val Ser Val Thr
275 280 285 

GTG GGA GTT ATT CTG CTG GCT GTT GTT ATC GGT TTC CTT CTC AGT GGA 910 
Val Gly Val Ile Leu Leu Ala Val Val Ile Gly Phe Leu Leu Ser Gly
290 295 300 

AGG CGC TGT GGC TAC AGC AAG GCT AAA CAA GAC CCA GAA GAA GAA AAG 958 
Arg Arg Cys Gly Tyr Ser Lys Ala Lys Gln Asp Pro Glu Glu Glu Lys
305 310 315 

ATG CAT TTT CAT AAT GGC CAC ATT AAA CTG CCT GGT GTA AGA ACC TAC 1006 
Met His Phe His Asn Gly His Ile Lys Leu Pro Gly Val Arg Thr Tyr
320 325 330 335 



ATT GAT CCC CAC ACC TAT GAG GAC CCT AAT CAA GCT GTC CAC GAG TTT 1054 
Ile Asp Pro His Thr Tyr Glu Asp Pro Asn Gln Ala Val His Glu Phe
340 345 350 
GCC AAG GAA ATA GAA GCT TCG TGC ATA ACC ATC GAG AGA GTT ATC GGA 1102
Ala Lys Glu Ile Glu Ala Ser Cys Ile Thr Ile Glu Arg Val Ile Gly
355 360 365 

GCT GGT GAA TTT GGA GAA GTC TGC AGT GGA CGG CTG AAA CTG CAG GGA 1150 
Ala Gly Glu Phe Gly Glu Val Cys Ser Gly Arg Leu Lys Leu Gln Gly
370 375 380 

AAA CGC GAG TTT CCA GTG GCT ATC AAA ACC CTG AAG GTG GGC TAC ACA 1198 
Lys Arg Glu Phe Pro Val Ala Ile Lys Thr Leu Lys Val Gly Tyr Thr
385 390 395 

GAG AAG CAA AGG CGA GAT TTC CTG GGA GAA GCG AGC ATC ATG GGG CAG 1246 
Glu Lys Gln Arg Arg Asp Phe Leu Gly Glu Ala Ser Ile Met Gly Gln
400 405 410 415

TTC GAC CAC CCC AAC ATC ATC CAC CTG GAA GGT GTC GTC ACA AAA AGC 1294 
Phe Asp His Pro Asn Ile Ile His Leu Glu Gly Val Val Thr Lys Ser
420 425 430 

AAA CCT GTA ATG ATA GTA ACG GAA TAC ATG GAA AAT GGT TCT CTG GAT 1342 
Lys Pro Val Met Ile Val Thr Glu Tyr Met Glu Asn Gly Ser Leu Asp
435 440 445 

ACA TTT TTA AAG AAG AAC GAT GGG CAG TTC ACG GTC ATT CAG CTG GTC 1390 
Thr Phe Leu Lys Lys Asn Asp Gly Gln Phe Thr Val Ile Gln Leu Val
450 455 460 

GGG ATG CTG CGA GGC ATC GCA TCA GGG ATG AAG TAC CTG TCT GAC ATG 1438 
Gly Met Leu Arg Gly Ile Ala Ser Gly Met Lys Tyr Leu Ser Asp Met
465 470 475 

GGT TAC GTA CAC AGA GAC CTC GCT GCC AGG AAT ATC CTC ATC AAC AGC 1486 
Gly Tyr Val His Arg Asp Leu Ala Ala Arg Asn Ile Leu Ile Asn Ser
480 485 490 495 

AAC TTA GTC TGC AAG GTG TCT GAC TTT GGC CTC TCC AGA GTC CTA GAA 1534 
Asn Leu Val Cys Lys Val Ser Asp Phe Gly Leu Ser Arg Val Leu Glu 
500 505 510 

GAT GAT CCT GAA GCA GCG TAC ACA ACC AGG GGA GGG AAG ATC CCC ATC 1582 
Asp Asp Pro Glu Ala Ala Tyr Thr Thr Arg Gly Gly Lys Ile Pro Ile
515 520 525 

CGA TGG ACG GCA CCT GAA GCA ATC GCC TTC CGC AAA TTC ACG TCG GCC 1630 
Arg Trp Thr Ala Pro Glu Ala Ile Ala Phe Arg Lys Phe Thr Ser Ala
530 535 540 

AGC GAT GTG TGG AGC TAC GGC ATT GTG ATG TGG GAA GTG ATG TCC TAT 1678 
Ser Asp Val Trp Ser Tyr Gly Ile Val Met Trp Glu Val Met Ser Tyr
545 550 555 

GGC GAG AGA CCT TAC TGG GAA ATG ACA AAC CAA GAT GTG ATT AAA GCC 1726 
Gly Glu Arg Pro Tyr Trp Glu Met Thr Asn Gln Asp Val Ile Lys Ala
560 565 570 575 

GTG GAG GAA GGC TAT CGC CTG CCA AGT CCC ATG GAC TGC CCT GCT GCT 1774 
Val Glu Glu Gly Tyr Arg Leu Pro Ser Pro Met Asp Cys Pro Ala Ala 
580 585 590

CTC TAC CAG TTG ATG CTT GAC TGC TGG CAG AAA GAC CGC AAC AGC AGG 1822 
Leu Tyr Gln Leu Met Leu Asp Cys Trp Gln Lys Asp Arg Asn Ser Arg
595 600 605 

CCC AAG TTT GAT GAA ATT GTC AGC ATG TTG GAC AAG CTC ATC CGT AAC 1870 
Pro Lys Phe Asp Glu Ile Val Ser Met Leu Asp Lys Leu Ile Arg Asn
610 615 620 
CCA AGC AGC TTG AAG ACG TTG GTT AAT GCA TCG AGC AGA GTA TCA AAT 1918
Pro Ser Ser Leu Lys Thr Leu Val Asn Ala Ser Ser Arg Val Ser Asn
625 630 635 

TTG TTG GTA GAA CAC AGT CCA GTG GGG AGC GGT GCC TAC AGG TCA GTG 1966 
Leu Leu Val Glu His Ser Pro Val Gly Ser Gly Ala Tyr Arg Ser Val 
640 645 650 655 

GGT GAG TGG CTG GAA GCC ATC AAA ATG GGT CGA TAC ACC GAG ATT TTC 2014 
Gly Glu Trp Leu Glu Ala Ile Lys Met Gly Arg Tyr Thr Glu Ile Phe
660 665 670

ATG GAG AAT GGA TAC AGT TCG ATG GAT TCT GTG GCT CAG GTG ACC CTA 2062 
Met Glu Asn Gly Tyr Ser Ser Met Asp Ser Val Ala Gln Val Thr Leu
675 680 685 

GAG GAT TTG AGG CGG CTG GGA GTG ACA CTT GTT GGT CAC CAG AAG AAG 2110 
Glu Asp Leu Arg Arg Leu Gly Val Thr Leu Val Gly His Gln Lys Lys
690 695 700 

ATA ATG AAC AGC CTT CAA GAG ATG AAG GTC CAG TTG GTG AAT GGG ATG 2158 
Ile Met Asn Ser Leu Gln Glu Met Lys Val Gln Leu Val Asn Gly Met
705 710 715 

GTG CCA TTG TAACTCGGTT TTTAAGTCAC TTCCTCGAGT GGTCGGTCCT 2207 

Val Pro Leu 

720 



GCACTTTGTA TACTAGCTCT GAGATTTATT TTGACTAAAG AAGAAAAAAG GGAAATTCAG 2267

TGGTTTCTGT AACTGAAGGA CGCTGGCTTC TGCCACAGCA TTTATAAAGC AGTGTTTGAC 2327

TGAAGTTTTC ATTTTCTTCC TATTTGTGTC CTCATTCTCA TGAAGTAAAT GTAACATGCA 2387

TGGAACATGG AAATGGATCT ACTGTACATG AGGTTACCCA ATTTCTTGCG CTTCAGCATG 2447

ACAACAGCAA GCCTTCCCAC CACATGTTGT CTATACATGG GAGATATATA TATATGCATA 2507

TATATATATA GCACCTTTAT ATACTGAATT ACAGCAGCAG CACATGTTAA TACTTCCAAG 2567

GACTTACTTG ACTAGAGAAG TTTTGCAGCC ATTGTGGGCT CACACAAGCT GCGGTTTACT 2627

GAAGTTTACT TCAAGTCTTA CTTGTCTACA GAAGTGTATT GAAGAGCAAT ATGATTAGAT 2687

TATTTCTGGA TAGATATTTT GTTTTGTAAA TTTAAAAAAT CGTGTTACAC AGCGTTAAGT 2747

TATAGAGACT AGTGTATAAA CATGTTGCTT GCTCAATGGC AAATACAATA CAGGGTGTAT 2807

AriTTTTTCT CTCTGTGTTG CAAAGTTCTT TTAGTTTGCT CTTCTGTGAG GATAATACGT 2867

TATGATGTAT ATACTGTACA GTTTGCTACA CATCAGGTAC AAGATTGGGG CTTTCTCAAT 2927

GTTTTGTTCT TTTTCCCTCT TTTGTTTCAT TTTGTCTTCC TTTTGTGTTA ACCACTATGC 2987

TTTGTATTTT TGCTGCTGTT TGGTTTGAGG CAACATATAA AGCTTTCAGG TGTTTTGATT 3047

ATAAAAAAAA AG 3059

(2) INFORMATION FOR SEQ ID NO: 4:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 722 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4:

Leu Lys Phe Thr Leu Arg Asp Cys Asn Ser Leu Pro Gly Gly Leu Gly 
1 5 10 15 

Thr Cys Lys Glu Thr Phe Asn Met Tyr Tyr Phe Glu Ser Asp Asp Glu 
20 25 30 

Asp Gly Arg Asn Ile Arg Glu Asn Gln Tyr Ile Lys Ile Asp Thr Ile
35 40 45 

Ala Ala Asp Glu Ser Phe Thr Glu Leu Asp Leu Gly Asp Arg Val Met 
50 55 60 

Lys Leu Asn Thr Glu Val Arg Asp Val Gly Pro Leu Thr Lys Lys Gly 
65 70 75 80 

Phe Tyr Leu Ala Phe Gln Asp Val Gly Ala Cys Ile Ala Leu Val Ser
85 90 95

Val Arg Val Tyr Tyr Lys Lys Cys Pro Ser Val Ile Arg Asn Leu Ala
100 105 110

Arg Phe Pro Asp Thr Ile Thr Gly Ala Asp Ser Ser Gln Leu Leu Glu
115 120 125 

Val Ser Gly Val Cys Val Asn His Ser Val Thr Asp Glu Ala Pro Lys 
130 135 140 

Met His Cys Ser Ser Glu Gly Glu Trp Leu Val Pro Ile Gly Lys Cys
145 150 155 160

Leu Cys Lys Ala Gly Tyr Glu Glu Lys Asn Asn Thr Cys Gln Ala Pro
165 170 175

Ser Pro Val Ser Ser Val Lys Lys Gly Lys Ile Thr Lys Asn Ser Ile
180 185 190

Ser Leu Ser Trp Gln Glu Pro Asp Arg Pro Asn Gly Ile Ile Leu Glu
195 200 205

Tyr Glu Ile Lys Tyr Phe Glu Lys Asp Gln Glu Thr Ser Tyr Thr Ile
210 215 220

Ile Lys Ser Lys Glu Thr Ala Ile Thr Ala Asp Gly Leu Lys Pro Gly
225 230 235 240

Ser Ala Tyr Val Phe Gln Ile Arg Ala Arg Thr Ala Ala Gly Tyr Gly
245 250 255 

Gly Phe Ser Arg Arg Phe Glu Phe Glu Thr Ser Pro Val Leu Ala Ala 
260 265 270 

Ser Ser Asp Gln Ser Gln Ile Pro Ile Ile Val Val Ser Val Thr Val
275 280 285

Gly Val Ile Leu Leu Ala Val Val Ile Gly Phe Leu Leu Ser Gly Arg
290 295 300 
Arg Cys Gly Tyr Ser Lys Ala Lys Gln Asp Pro Glu Glu Glu Lys Met
305 310 315 320

His Phe His Asn Gly His Ile Lys Leu Pro Gly Val Arg Thr Tyr Ile
325 330 335

Asp Pro His Thr Tyr Glu Asp Pro Asn Gln Ala Val His Glu Phe Ala
340 345 350

Lys Glu Ile Glu Ala Ser Cys Ile Thr Ile Glu Arg Val Ile Gly Ala
355 360 365

Gly Glu Phe Gly Glu Val Cys Ser Gly Arg Leu Lys Leu Gln Gly Lys
370 375 380

Arg Glu Phe Pro Val Ala Ile Lys Thr Leu Lys Val Gly Tyr Thr Glu
385 390 395 400

Lys Gln Arg Arg Asp Phe Leu Gly Glu Ala Ser Ile Met Gly Gln Phe
405 410 415

Asp His Pro Asn Ile Ile His Leu Glu Gly Val Val Thr Lys Ser Lys
420 425 430

Pro Val Met Ile Val Thr Glu Tyr Met Glu Asn Gly Ser Leu Asp Thr
435 440 445

Phe Leu Lys Lys Asn Asp Gly Gln Phe Thr Val Ile Gln Leu Val Gly
450 455 460

Met Leu Arg Gly Ile Ala Ser Gly Met Lys Tyr Leu Ser Asp Met Gly
465 470 475 480

Tyr Val His Arg Asp Leu Ala Ala Arg Asn Ile Leu Ile Asn Ser Asn
485 490 495 

Leu Val Cys Lys Val Ser Asp Phe Gly Leu Ser Arg Val Leu Glu Asp 
500 505 510 

Asp Pro Glu Ala Ala Tyr Thr Thr Arg Gly Gly Lys Ile Pro Ile Arg
515 520 525

Trp Thr Ala Pro Glu Ala Ile Ala Phe Arg Lys Phe Thr Ser Ala Ser
530 535 540

Asp Val Trp Ser Tyr Gly Ile Val Met Trp Glu Val Met Ser Tyr Gly
545 550 555 560

Glu Arg Pro Tyr Trp Glu Met Thr Asn Gln Asp Val Ile Lys Ala Val
565 570 575 

Glu Glu Gly Tyr Arg Leu Pro Ser Pro Met Asp Cys Pro Ala Ala Leu 
580 585 590 

Tyr Gln Leu Met Leu Asp Cys Trp Gln Lys Asp Arg Asn Ser Arg Pro
595 600 605

Lys Phe Asp Glu Ile Val Ser Met Leu Asp Lys Leu Ile Arg Asn Pro
610 615 620 

Ser Ser Leu Lys Thr Leu Val Asn Ala Ser Ser Arg Val Ser Asn Leu 
625 630 635 640 

Leu Val Glu His Ser Pro Val Gly Ser Gly Ala Tyr Arg Ser Val Gly
645 650 655

Glu Trp Leu Glu Ala Ile Lys Met Gly Arg Tyr Thr Glu Ile Phe Met
660 665 670 

Glu Asn Gly Tyr Ser Ser Met Asp Ser Val Ala Gin Val Thr Leu Glu 
675 680 685

Asp Leu Arg Arg Leu Gly Val Thr Leu Val Gly His Gin Lys Lys He 
690 695 700 

Met Asn Ser Leu Gin Glu Met Lys Val Gin Leu Val Asn Gly Met Val 
705 710 715 720 

Pro Leu 



(2) INFORMATION FOR SEQ ID NO: 5: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2820 base pairs 

(B) TYPE: nucleic acid

(C) STRANDEDNESS: both

(D) TOPOLOGY: linear 



(ix) FEATURE: 

(A) NAME/KEY: CDS

(B) LOCATION: 2..2548

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5: 

C GGA GAG AGC CAG TTT GCC AAG ATT GAC ACC ATT GCT GCT GAT GAG 46 
Gly Glu Ser Gln Phe Ala Lys Ile Asp Thr Ile Ala Ala Asp Glu
1 5 10 15

AGC TTC ACC CAG GTG GAC ATT GGT GAC AGG ATC ATG AAG CTG AAT ACA 94 
Ser Phe Thr Gin Val Asp He Gly Asp Arg He Met Lys Leu Asn Thr 
20 25 30 

GAG GTG CGG GAC GTG GGG CCT CTC AGC AAG AAA GGG TTT TAC TTG GCT 142 
Glu Val Arg Asp Val Gly Pro Leu Ser Lys Lys Gly Phe Tyr Leu Ala 
35 40 45 

TTC CAG GAC GTC GGT GCC TGC ATT GCT TTG GTG TCT GTT CGT GTC TTC 190 
Phe Gin Asp Val Gly Ala Cys He Ala Leu Val Ser Val Arg Val Phe 
50 55 60 

TAT AAG AAG TGC CCA CTG ACA GTT CGA AAC CTG GCA CAG TTT CCA GAC 238 
Tyr Lys Lys Cys Pro Leu Thr Val Arg Asn Leu Ala Gin Phe Pro Asp 
65 70 75 

ACC ATT ACT GGG GCT GAT ACA TCC TCT CTG GTG GAG GTT CGT GGC TCC 286
Thr Ile Thr Gly Ala Asp Thr Ser Ser Leu Val Glu Val Arg Gly Ser
80 85 90 95 

TGT GTC AAC AAC TCG GAA GAG AAG GAC GTG CCA AAA ATG TAC TGC GGG 334 
Cys Val Asn Asn Ser Glu Glu Lys Asp Val Pro Lys Met Tyr Cys Gly 
100 105 110

GCA GAT GGT GAA TGG CTG GTA CCC ATT GGC AAC TGT CTG TGC AAT GCT 382 
Ala Asp Gly Glu Trp Leu Val Pro He Gly Asn Cys Leu Cys Asn Ala 
115 120 125 



GGC TAT GAA GAA CGC AAT GGT GAA TGC CAA GCT TGC AAA ATC GGA TAC 430 
Gly Tyr Glu Glu Arg Asn Gly Glu Cys Gin Ala Cys Lys He Gly Tyr 
130 135 140 



TAC AAG GCG CTC TCA ACA GAT GTT GCA TGT GCC AAA TGC CCG CCT CAC 478
Tyr Lys Ala Leu Ser Thr Asp Val Ala Cys Ala Lys Cys Pro Pro His
145 150 155 

AGC TAC TCC ATC TGG GAA GGC TCT ACC TCC TGC ACC TGT GAT CGG GGC 526 
Ser Tyr Ser lie Trp Glu Gly Ser Thr Ser Cys Thr Cys Asp Arg Gly 
160 165 170 175 

TTC TTC CGA GCA GAA AAT GAT GCT GCA TCC ATG CCC TGC ACT CGC CCT 574 
Phe Phe Arg Ala Glu Asn Asp Ala Ala Ser Met Pro Cys Thr Arg Pro 
180 185 190 

CCA TCC GCA CCC CAG AAC CTG ATT TCC AAC GTC AAC GAG ACG TCA GTG 622 
Pro Ser Ala Pro Gin Asn Leu lie Ser Asn Val Asn Glu Thr Ser Val 
195 200 205 

AAC TTG GAG TGG AGC GCC CCA CAG AAC AAG GGA GGA CGG GAC GAC ATC 670 
Asn Leu Glu Trp Ser Ala Pro Gin Asn Lys Gly Gly Arg Asp Asp lie 
210 215 220 

TCC TAC AAC GTG GTG TGC AAG CGC TGC GGG GCA GGG GAG CCC AGC CAC 718 
Ser Tyr Asn Val Val Cys Lys Arg Cys Gly Ala Gly Glu Pro Ser His 
225 230 235 

TGC CGG TCC TGT GGC AGT GGT GTA CAT TTC AGC CCC CAG CAG AAC GGG 766 
Cys Arg Ser Cys Gly Ser Gly Val His Phe Ser Pro Gin Gin Asn Gly 
240 245 250 255 

CTG AAA ACC ACG AAG GTT TCC ATC ACT GAC CTC CTG GCA CAC ACC AAC 814 
Leu Lys Thr Thr Lys Val Ser lie Thr Asp Leu Leu Ala His Thr Asn 
260 265 270 

TAC ACC TTT GAG GTC TGG GCA GTG AAT GGA GTG TCC AAG CAC AAC CCC 862 
Tyr Thr Phe Glu Val Trp Ala Val Asn Gly Val Ser Lys His Asn Pro 
275 280 285 

AGC CAG GAC CAA GCT GTG TCG GTC ACT GTG ACA ACT AAC CAA GCA GCT 910 
Ser Gin Asp Gin Ala Val Ser Val Thr Val Thr Thr Asn Gin Ala Ala 
290 295 300 

CCA TCC CCA ATT GCA TTG ATC CAG GCT AAA GAG ATA ACG AGG CAC AGC 958 
Pro Ser Pro He Ala Leu He Gin Ala Lys Glu He Thr Arg His Ser 
305 310 315 

GTT GCC TTG GCC TGG CTG GAA CCT GAC AGG CCC AAT GGA GTC ATC CTG 1006 
Val Ala Leu Ala Trp Leu Glu Pro Asp Arg Pro Asn Gly Val He Leu 
320 325 330 335 

GAG TAC GAA GTC AAG TAC TAC GAA AAG GAC CAA AAC GAG CGC ACG TAT 1054 
Glu Tyr Glu Val Lys Tyr Tyr Glu Lys Asp Gin Asn Glu Arg Thr Tyr 
340 345 350 

CGC ATT GTG AAG ACA GCC TCC AGG AAT ACT GAC ATC AAA GGT TTG AAC 1102 
Arg He Val Lys Thr Ala Ser Arg Asn Thr Asp He Lys Gly Leu Asn 
355 360 365 

CCC CTG ACT TCA TAT GTA TTT CAT GTG £GG GCC AGG ACA GCA GCA GGA 1150 
Pro Leu Thr Ser Tyr Val Phe His Val Arg Ala Arg Thr Ala Ala Gly 
370 375 380 

TAC GGA GAC TTC AGT GGG CCG TTT GAG TTC ACA ACT AAC ACA GTT CCT 1198 
Tyr Gly Asp Phe Ser Gly Pro Phe Glu Phe Thr Thr Asn Thr Val Pro 
385 390 395 

TCC CCC ATC ATT GGC GAT GGT ACC AAT CCC ACA GTG CTG CTT GTT TCA 1246 
Ser Pro He He Gly Asp Gly Thr Asn Pro Thr Val Leu Leu Val Ser 
400 405 410 415

GTG GCT GGC AGT GTT GTT CTT GTG GTC ATT CTC ATT GCA GCC TTT GTC 1294
Val Ala Gly Ser Val Val Leu Val Val Ile Leu Ile Ala Ala Phe Val
420 425 430 

ATC AGC AGG AGG CGC AGC AAA TAC AGT AAA GCT AAG CAA GAG GCA GAT 1342 
He Ser Arg Arg Arg Ser Lys Tyr Ser Lys Ala Lys Gin Glu Ala Asp 
435 440 445 

GAG GAG AAA CAT TTG AAC CAA GGT GTC AGA ACA TAT GTG GAT CCT TTT 1390 
Glu Glu Lys His Leu Asn Gin Gly Val Arg Thr Tyr Val Asp Pro Phe 
450 455 460 

ACA TAT GAG GAT CCA AAT CAA GCT GTG AGG GAA TTT GCC AAA GAA ATT 1438 
Thr Tyr Glu Asp Pro Asn Gin Ala Val Arg Glu Phe Ala Lys Glu He 
465 470 475 

GAT GCC TCC TGC ATA AAG ATT GAG AAA GTT ATT GGT GTG GGG GAA TTT 1486 
Asp Ala Ser Cys He Lys He Glu Lys Val He Gly Val Gly Glu Phe 
480 485 490 495 

GGT GAA GTA TGC AGT GGA CGT CTC AAA GTT CCA GGA AAA AGA GAA ATC 1534 
Gly Glu Val Cys Ser Gly Arg Leu Lys Val Pro Gly Lys Arg Glu He 
500 505 510 

TGT GTG GCT ATC AAG ACT CTG AAA GCT GGT TAC ACT GAC AAA CAA CGG 1582 
Cys Val Ala He Lys Thr Leu Lys Ala Gly Tyr Thr Asp Lys Gin Arg 
515 520 525 

AGA GAC TTC CTG AGT GAG GCC AGC ATC ATG GGA CAA TTT GAC CAC CCC 1630 
Arg Asp Phe Leu Ser Glu Ala Ser He Met Gly Gin Phe Asp His Pro 
530 535 540 

AAT ATC ATC CAC TTG GAA GGC GTT GTT ACT AAA TGT AAA CCA GTA ATG 1678 
Asn He He His Leu Glu Gly Val Val Thr Lys Cys Lys Pro Val Met 
545 550 555 

ATC ATA ACT GAG TAC ATG GAG AAT GGC TCC TTG GAT GCC TTC CTC CGG 1726 
He He Thr Glu Tyr Met Glu Asn Gly Ser Leu Asp Ala Phe Leu Arg 
560 565 570 575 

AAG AAT GAT GGC AGA TTT ACA GTA ATC CAG TTG GTG GGG ATG CTT CGT 1774 
Lys Asn Asp Gly Arg Phe Thr Val He Gin Leu Val Gly Met Leu Arg 
580 585 590 

GGC ATC GGC TCA GGA ATG AAG TAT CTG TCT GAC ATG AGC TAT GTG CAT 1822 
Gly He Gly Ser Gly Met Lys Tyr Leu Ser Asp Met Ser Tyr Val His 
595 600 605 

CGG GAT CTA GCT GCT CGA AAC ATA CTG GTC AAC AGC AAC TTG GTC TGC 1870 
Arg Asp Leu Ala Ala Arg Asn lie Leu Val Asn Ser Asn Leu Val Cys 
610 615 620 

AAA GTG TCT GAC TTT GGC ATG TCC CGT GTC CTG GAA GAT GAC CCT GAG 1918 
Lys Val Ser Asp Phe Gly Met Ser Arg Val Leu Glu Asp Asp Pro Glu 
625 630 635 

GCA GCT TAT ACC ACA CGG GGT GGC AAG ATC CCT ATC CGA TGG ACT GCA 1966 
Ala Ala Tyr Thr Thr Arg Gly Gly Lys He Pro He Arg Trp Thr Ala 
640 645 650 655 

CCA GAG GCA ATT GCC TAC CGT AAA TTT ACA TCG GCT AGT GAC GTG TGG 2014 
Pro Glu Ala He Ala Tyr Arg Lys Phe Thr Ser Ala Ser Asp Val Trp 
660 665 670 

AGC TAT GGC ATC GTC ATG TGG GAA GTG ATG TCC TAT GGA GAG AGA CCT 2062 
Ser Tyr Gly He Val Met Trp Glu Val Met Ser Tyr Gly Glu Arg Pro 
675 680 685

TAC TGG GAT ATG TCC AAT CAA GAC GTT ATT AAA GCC ATT GAG GAA GGG 2110
Tyr Trp Asp Met Ser Asn Gln Asp Val Ile Lys Ala Ile Glu Glu Gly
690 695 700 

TAT CGG TTG CCA CCC CCA ATG GAC TGC CCC ATT GCT CTC CAT CAG CTG 2158 
Tyr Arg Leu Pro Pro Pro Met Asp Cys Pro lie Ala Leu His Gin Leu 
705 710 715 

ATG TTA GAC TGC TGG CAG AAG GAA CGC AGC GAC AGA CCT AAA TTT GGA 2206 
Met Leu Asp Cys Trp Gin Lys Glu Arg Ser Asp Arg Pro Lys Phe Gly 
720 725 730 735 

CAG ATT GTC AAC ATG CTG GAC AAA CTC ATC CGC AAC CCT AAC AGC CTG 2254 
Gin lie Val Asn Met Leu Asp Lys Leu He Arg Asn Pro Asn Ser Leu 
740 745 750

AAG AGG ACA GGC AGC GAG AGC TCC AGA CCC AGC ACA GCC CTG CTG GAT 2302 
Lys Arg Thr Gly Ser Glu Ser Ser Arg Pro Ser Thr Ala Leu Leu Asp 
755 760 765 

CCC AGC TCC CCG GAG TTC TCG GCG GTT GTT TCT GTC AGT GAC TGG CTC 2350 
Pro Ser Ser Pro Glu Phe Ser Ala Val Val Ser Val Ser Asp Trp Leu 
770 775 780 

CAA GCC ATT AAA ATG GAG CGA TAC AAG GAT AAC TTC ACA GCT GCT GGC 2398 
Gin Ala He Lys Met Glu Arg Tyr Lys Asp Asn Phe Thr Ala Ala Gly 
785 790 795 

TAT ACC ACC CTA GAG GCT GTG GTG CAT ATG AAC CAG GAC GAC CTG GCC 2446 
Tyr Thr Thr Leu Glu Ala Val Val His Met Asn Gin Asp Asp Leu Ala 
800 805 810 815 

AGG ATC GGG ATC ACT GCC ATC ACA CAC CAG AAC AAG ATC TTG AGC AGC 2494 
Arg He Gly He Thr Ala He Thr His Gin Asn Lys He Leu Ser Ser 
820 825 830

GTT CAA GCC ATG CGC AGC CAA ATG CAA CAG ATG CAC GGC AGG ATG GTG 2542 
Val Gin Ala Met Arg Ser Gin Met Gin Gin Met His Gly Arg Met Val 
835 840 845 

CCC GTC TGAGCCAGTA CTGAATAAAC TCAAAACTCT TGAAATTAGT TTACCTCATC 2598 
Pro Val 

CATGCACTTT AATTGAAGAA CTGCACTTTT TTTACTTCGT CTCCTCGCCC GTTGAAATAA 2658 

AGATCTGCAG CATTGCTTGA TGTACAGATT GTGGAAACCG AGCGTGTGTT GGGAGGGGGG 2718 

CCTCCAGAAA TGACAAGCCG TCATTTTAAA CCAGACCTGG AACAAATTGT TTCTTGGAAC 2778 

ATACTTCTCT GTTGATCAAC GATATGTAAA ATACATGTAT CC 2820 
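The CDS locations given in this listing can be cross-checked against the stated protein lengths: SEQ ID NO: 5 has a coding region at 2..2548 and its translation, SEQ ID NO: 6, is stated to be 849 amino acids long, while SEQ ID NO: 7 has a coding region at 290..3208 and SEQ ID NO: 8 is stated to be 973 amino acids long. The following is a minimal Python sketch of that arithmetic. The function name `cds_codon_count` is hypothetical and is used here only for illustration; it assumes 1-based, inclusive coordinates and a location that does not include a stop codon.

```python
def cds_codon_count(start: int, end: int) -> int:
    """Return the number of complete codons in a CDS location.

    Assumes 1-based, inclusive nucleotide coordinates, as used in the
    (B) LOCATION fields of this listing.
    """
    length = end - start + 1  # inclusive span in nucleotides
    if length <= 0 or length % 3 != 0:
        raise ValueError(f"CDS span of {length} nt is not a whole number of codons")
    return length // 3


if __name__ == "__main__":
    # SEQ ID NO: 5, CDS 2..2548, compared with the length stated for SEQ ID NO: 6
    print(cds_codon_count(2, 2548))
    # SEQ ID NO: 7, CDS 290..3208, compared with the length stated for SEQ ID NO: 8
    print(cds_codon_count(290, 3208))
```

A span that is not a multiple of three would indicate a transcription error in either the location field or the position counters at the right-hand margin.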

(2) INFORMATION FOR SEQ ID NO: 6: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 849 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6: 

Gly Glu Ser Gln Phe Ala Lys Ile Asp Thr Ile Ala Ala Asp Glu Ser
1 5 10 15

Phe Thr Gln Val Asp Ile Gly Asp Arg Ile Met Lys Leu Asn Thr Glu
20 25 30 

Val Arg Asp Val Gly Pro Leu Ser Lys Lys Gly Phe Tyr Leu Ala Phe 
35 40 45 

Gin Asp Val Gly Ala Cys lie Ala Leu Val Ser Val Arg Val Phe Tyr 
50 55 60 

Lys Lys Cys Pro Leu Thr Val Arg Asn Leu Ala Gin Phe Pro Asp Thr 
65 70 75 80 

He Thr Gly Ala Asp Thr Ser Ser Leu Val Glu Val Arg Gly Ser Cys 
85 90 95 

Val Asn Asn Ser Glu Glu Lys Asp Val Pro Lys Met Tyr Cys Gly Ala 
100 105 110

Asp Gly Glu Trp Leu Val Pro He Gly Asn Cys Leu Cys Asn Ala Gly 
115 120 125 

Tyr Glu Glu Arg Asn Gly Glu Cys Gin Ala Cys Lys He Gly Tyr Tyr 
130 135 140 

Lys Ala Leu Ser Thr Asp Val Ala Cys Ala Lys Cys Pro Pro His Ser 
145 150 155 160

Tyr Ser He Trp Glu Gly Ser Thr Ser Cys Thr Cys Asp Arg Gly Phe 
165 170 175 

Phe Arg Ala Glu Asn Asp Ala Ala Ser Met Pro Cys Thr Arg Pro Pro 
180 185 190 

Ser Ala Pro Gin Asn Leu He Ser Asn Val Asn Glu Thr Ser Val Asn 
195 200 205 

Leu Glu Trp Ser Ala Pro Gin Asn Lys Gly Gly Arg Asp Asp He Ser 
210 215 220 

Tyr Asn Val Val Cys Lys Arg Cys Gly Ala Gly Glu Pro Ser His Cys 
225 230 235 240 

Arg Ser Cys Gly Ser Gly Val His Phe Ser Pro Gin Gin Asn Gly Leu 
245 250 255 

Lys Thr Thr Lys Val Ser He Thr Asp Leu Leu Ala His Thr Asn Tyr 
260 265 270 

Thr Phe Glu Val Trp Ala Val Asn Gly Val Ser Lys His Asn Pro Ser 
275 280 285 

Gin Asp Gin Ala Val Ser Val Thr Val Thr Thr Asn Gin Ala Ala Pro 
290 295 300 

Ser Pro He Ala Leu He Gin Ala Lys Glu He Thr Arg His Ser Val 
305 310 315 320

Ala Leu Ala Trp Leu Glu Pro Asp Arg Pro Asn Gly Val He Leu Glu 
325 330 335

Tyr Glu Val Lys Tyr Tyr Glu Lys Asp Gin Asn Glu Arg Thr Tyr Arg 
340 345 350 

He Val Lys Thr Ala Ser Arg Asn Thr Asp He Lys Gly Leu Asn Pro 
355 360 365

Leu Thr Ser Tyr Val Phe His Val Arg Ala Arg Thr Ala Ala Gly Tyr
370 375 380 

Gly Asp Phe Ser Gly Pro Phe Glu Phe Thr Thr Asn Thr Val Pro Ser 
385 390 395 400 

Pro lie lie Gly Asp Gly Thr Asn Pro Thr Val Leu Leu Val Ser Val 
405 410 415 

Ala Gly Ser Val Val Leu Val Val lie Leu He Ala Ala Phe Val He 
420 425 430 

Ser Arg Arg Arg Ser Lys Tyr Ser Lys Ala Lys Gin Glu Ala Asp Glu 
435 440 445 

Glu Lys His Leu Asn Gin Gly Val Arg Thr Tyr Val Asp Pro Phe Thr 
450 455 460 

Tyr Glu Asp Pro Asn Gin Ala Val Arg Glu Phe Ala Lys Glu He Asp 
465 470 475 480 

Ala Ser Cys He Lys He Glu Lys Val He Gly Val Gly Glu Phe Gly 
485 490 495 

Glu Val Cys Ser Gly Arg Leu Lys Val Pro Gly Lys Arg Glu He Cys 
500 505 510

Val Ala He Lys Thr Leu Lys Ala Gly Tyr Thr Asp Lys Gin Arg Arg 
515 520 525 

Asp Phe Leu Ser Glu Ala Ser He Met Gly Gin Phe Asp His Pro Asn 
530 535 540 

He He His Leu Glu Gly Val Val Thr Lys Cys Lys Pro Val Met He 
545 550 555 560 

He Thr Glu Tyr Met Glu Asn Gly Ser Leu Asp Ala Phe Leu Arg Lys 
565 570 575 

Asn Asp Gly Arg Phe Thr Val He Gin Leu Val Gly Met Leu Arg Gly 
580 585 590 

He Gly Ser Gly Met Lys Tyr Leu Ser Asp Met Ser Tyr Val His Arg 
595 600 605 

Asp Leu Ala Ala Arg Asn He Leu Val Asn Ser Asn Leu Val Cys Lys 
610 615 620 

Val Ser Asp Phe Gly Met Ser Arg Val Leu Glu Asp Asp Pro Glu Ala 
625 630 635 640 

Ala Tyr Thr Thr Arg Gly Gly Lys lie Pro He Arg Trp Thr Ala Pro 
645 650 655

Glu Ala He Ala Tyr Arg Lys Phe Thr Ser Ala Ser Asp Val Trp Ser 
660 665 670 

Tyr Gly He Val Met Trp Glu Val Met Ser Tyr Gly Glu Arg Pro Tyr 
675 680 685 

Trp Asp Met Ser Asn Gin Asp Val He Lys Ala He Glu Glu Gly Tyr 
690 695 700 

Arg Leu Pro Pro Pro Met Asp Cys Pro Ile Ala Leu His Gln Leu Met
705 710 715 720

Leu Asp Cys Trp Gln Lys Glu Arg Ser Asp Arg Pro Lys Phe Gly Gln
725 730 735 

lie Val Asn Met Leu Asp Lys Leu lie Arg Asn Pro Asn Ser Leu Lys 
740 745 750 

Arg Thr Gly Ser Glu Ser Ser Arg Pro Ser Thr Ala Leu Leu Asp Pro 
755 760 765 

Ser Ser Pro Glu Phe Ser Ala Val Val Ser Val Ser Asp Trp Leu Gin 
770 775 780 

Ala lie Lys Met Glu Arg Tyr Lys Asp Asn Phe Thr Ala Ala Gly Tyr 
785 790 795 800 

Thr Thr Leu Glu Ala Val Val His Met Asn Gin Asp Asp Leu Ala Arg 
805 810 815 

lie Gly lie Thr Ala He Thr His Gin Asn Lys He Leu Ser Ser Val 
820 825 830 

Gin Ala Met Arg Ser Gin Met Gin Gin Met His Gly Arg Met Val Pro 
835 840 845 

Val 
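SEQ ID NO: 6 lists the residues encoded by the coding region of SEQ ID NO: 5, so the three-letter residue names can be checked against the codons. The sketch below assumes the standard genetic code and translates the first eight codons of SEQ ID NO: 5, starting at nucleotide 2 as given in the CDS location. The names `CODON_TABLE`, `THREE_LETTER` and `translate` are hypothetical and introduced only for this illustration.

```python
# Standard genetic code, laid out in the conventional T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}

# One-letter to three-letter residue codes, as used in the listing.
THREE_LETTER = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys",
    "Q": "Gln", "E": "Glu", "G": "Gly", "H": "His", "I": "Ile",
    "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
    "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val",
}


def translate(dna: str, cds_start: int = 1) -> list[str]:
    """Translate DNA from a 1-based start position until a stop codon
    or the end of the sequence, returning three-letter residue names."""
    coding = dna.upper().replace(" ", "")[cds_start - 1:]
    residues = []
    for pos in range(0, len(coding) - 2, 3):
        aa = CODON_TABLE[coding[pos:pos + 3]]
        if aa == "*":  # stop codon ends the open reading frame
            break
        residues.append(THREE_LETTER[aa])
    return residues


if __name__ == "__main__":
    # First 25 nucleotides of SEQ ID NO: 5; the CDS begins at position 2.
    start_of_seq5 = "C GGA GAG AGC CAG TTT GCC AAG ATT"
    print(" ".join(translate(start_of_seq5, cds_start=2)))
```

Under the standard code, CAG and CAA encode Gln and ATT, ATC and ATA encode Ile, which is how the residue names at those positions in the listing should read.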



(2) INFORMATION FOR SEQ ID NO:7: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 3776 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: both

(D) TOPOLOGY: linear 



(ix) FEATURE: 

(A) NAME/KEY: CDS

(B) LOCATION: 290..3208



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7: 

CGGCTCTGAC TTTGTGTTAA CGGTTTATGG ACTGGTTCCA AAGAGCTCAA AGGTACCAAA 60 

ACACTCCAAG CAACCTCTGA ACCATTCAAG CAAGTAGTGT GTGTTTATTG GATATGGTGG 120 

AGTCTACAGA GAATCTTCAT GGATTCTAAT GCTGACATCA GTGCAAGAAG AGTGTCAGGA 180 

ATGGATTGGC TCTGGCTGGT TTGCTTCTTT CATCTAGTCA CTTCACTAGA AGACCTGCAT 240 

CCTGACCAAC CGGAAAGGTG AGCAGGATGA GGCCATTGGT GGTGCTGTC ATG ACT 295
Met Thr
1

GAA ATA CTT CTG GAT ACA ACT GGA GAA ACC TCA GAG ATT GGC TGG ACC 343
Glu Ile Leu Leu Asp Thr Thr Gly Glu Thr Ser Glu Ile Gly Trp Thr
5 10 15

TCT CAC CCT CCT GAT GGG TGG GAA GAA GTA AGT GTC CGG GAT GAT AAG 391
Ser His Pro Pro Asp Gly Trp Glu Glu Val Ser Val Arg Asp Asp Lys
20 25 30

GAG CGC CAG ATC CGA ACC TTT CAA GTT TGT AAC ATG GAT GAA CCA GGT 439
Glu Arg Gln Ile Arg Thr Phe Gln Val Cys Asn Met Asp Glu Pro Gly
35 40 45 50

CAG AAT AAC TGG TTG CGT ACT CAC TTC ATA GAG CGA CGT GGA GCC CAC 487
Gln Asn Asn Trp Leu Arg Thr His Phe Ile Glu Arg Arg Gly Ala His
55 60 65 

CGA GTC CAT GTC CGC CTT CAT TTC TCA GTG AGG GAC TGT GCC AGC ATG 535 
Arg Val His Val Arg Leu His Phe Ser Val Arg Asp Cys Ala Ser Met 
70 75 80 

CGT ACT GTG GCC TCT ACT TGC AAA GAG ACT TTC ACA CTC TAC TAC CAC 583 
Arg Thr Val Ala Ser Thr Cys Lys Glu Thr Phe Thr Leu Tyr Tyr His 
85 90 95 

CAG TCA GAT GTC GAC ATA GCC TCT CAG GAA CTG CCA GAG TGG CAT GAA 631 
Gin Ser Asp Val Asp lie Ala Ser Gin Glu Leu Pro Glu Trp His Glu 
100 105 110 

GGC CCC TGG ACC AAG GTG GAT ACT ATT GCA GCT GAT GAA AGC TTT TCC 679 
Gly Pro Trp Thr Lys Val Asp Thr He Ala Ala Asp Glu Ser Phe Ser 
115 120 125 130 

CAG GTG GAC AGA ACT GGG AAG GTG GTA AGG ATG AAT GTT AAA GTA CGC 727 
Gin Val Asp Arg Thr Gly Lys Val Val Arg Met Asn Val Lys Val Arg 
135 140 145 

AGC TTT GGG CCA CTC ACA AAG CAT GGC TTC TAC CTG GCC TTC CAG GAC 775 
Ser Phe Gly Pro Leu Thr Lys His Gly Phe Tyr Leu Ala Phe Gin Asp 
150 155 160 

TCA GGA GCC TGT ATG TCC CTG GTG GCA GTC CAA GTC TTT TTC TAC AAG 823 
Ser Gly Ala Cys Met Ser Leu Val Ala Val Gin Val Phe Phe Tyr Lys 
165 170 175 

TGT CCA GCT GTG GTG AAA GGA TTT GCC TCC TTC CCT GAA ACT TTT GCT 871 
Cys Pro Ala Val Val Lys Gly Phe Ala Ser Phe Pro Glu Thr Phe Ala 
180 185 190 

GGA GGA GAG AGG ACC TCA CTG GTG GAG TCA CTA GGG ACG TGT GTA GCA 919 
Gly Gly Glu Arg Thr Ser Leu Val Glu Ser Leu Gly Thr Cys Val Ala 
195 200 205 210

AAT GCT GAA GAG GCA AGC ACA ACT GGG TCA TCA GGT GTT CGG TTG CAC 967 
Asn Ala Glu Glu Ala Ser Thr Thr Gly Ser Ser Gly Val Arg Leu His 
215 220 225 

TGC AAT GGA GAA GGA GAG TGG ATG GTG GCC ACT GGA CGA TGC TCT TGC 1015 
Cys Asn Gly Glu Gly Glu Trp Met Val Ala Thr Gly Arg Cys Ser Cys 
230 235 240 

AAG GCT GGT TAC CAA TCT GTT GAC AAT GAG CAA GCT TGT CAA GCT TGT 1063 
Lys Ala Gly Tyr Gin Ser Val Asp Asn Glu Gin Ala Cys Gin Ala Cys 
245 250 255 

CCC ATT GGT TCC TTT AAA GCA TCT GTG GGA GAT GAC CCT TGC CTT CTC 1111 
Pro He Gly Ser Phe Lys Ala Ser Val Gly Asp Asp Pro Cys Leu Leu 
260 265 270 

TGC CCT GCC CAC AGC CAT GCT CCA CTG CCA CTG CCA GGT TCC ATT GAA 1159 
Cys Pro Ala His Ser His Ala Pro Leu Pro Leu Pro Gly Ser He Glu 
275 280 285 290 

TGT GTG TGT CAG AGT CAC TAC TAC CGA TCT GCT TCT GAC AAT TCT GAT 1207 
Cys Val Cys Gin Ser His Tyr Tyr Arg Ser Ala Ser Asp Asn Ser Asp 
295 300 305 

GCT CCC TGC ACT GGC ATC CCC TCT GCT CCC CGT GAC CTC AGT TAT GAA 1255 
Ala Pro Cys Thr Gly He Pro Ser Ala Pro Arg Asp Leu Ser Tyr Glu 
310 315 320

ATT GTT GGC TCC AAC GTG CTC CTG ACC TGG CGC CTC CCC AAG GAC TTG 1303
Ile Val Gly Ser Asn Val Leu Leu Thr Trp Arg Leu Pro Lys Asp Leu
325 330 335 

GGT GGC CGC AAG GAT GTC TTC TTC AAT GTC ATC TGC AAG GAA TGC CCA 1351 
Gly Gly Arg Lys Asp Val Phe Phe Asn Val lie Cys Lys Glu Cys Pro 
340 345 350 

ACA AGG TCA GCA GGG ACA TGT GTG CGC TGT GGG GAC AAT GTA CAG TTT 1399 
Thr Arg Ser Ala Gly Thr Cys Val Arg Cys Gly Asp Asn Val Gin Phe 
355 360 365 370

GAA CCA CGC CAA GTG GGC CTG ACA GAA AGT CGT GTT CAA GTC TCC AAC 1447 
Glu Pro Arg Gin Val Gly Leu Thr Glu Ser Arg Val Gin Val Ser Asn 
375 380 385 

CTA TTG GCC CGT GTG CAG TAC ACT TTT GAG ATC CAG GCT GTC AAT TTG 1495 
Leu Leu Ala Arg Val Gin Tyr Thr Phe Glu He Gin Ala Val Asn Leu 
390 395 400 

GTG ACT GAG TTG AGT TCA GAA GCA CCC CAG TAT GCT ACC ATC AAC GTT 1543 
Val Thr Glu Leu Ser Ser Glu Ala Pro Gin Tyr Ala Thr He Asn Val 
405 410 415 

AGC ACC AGC CAG TCA GTG CCC TCC GCA ATC CCT ATG ATG CAT CAG GTG 1591 
Ser Thr Ser Gin Ser Val Pro Ser Ala He Pro Met Met His Gin Val 
420 425 430 

AGT CGT GCT ACC AGT AGC ATC ACA CTG TCT TGG CCT CAG CCA GAC CAG 1639 
Ser Arg Ala Thr Ser Ser He Thr Leu Ser Trp Pro Gin Pro Asp Gin 
435 440 445 450 

CCC AAT GGG GTT ATC CTG GAT TAC CAG CTA CGG TAC TTT GAC AAG GCA 1687 
Pro Asn Gly Val He Leu Asp Tyr Gin Leu Arg Tyr Phe Asp Lys Ala 
455 460 465

GAA GAT GAG GAT AAT TCA TTT ACT TTG ACT AGT GAA ACT AAC ATG GCC 1735 
Glu Asp Glu Asp Asn Ser Phe Thr Leu Thr Ser Glu Thr Asn Met Ala 
470 475 480 

ACT ATA TTA AAT CTG AGT CCA GGC AAG ATC TAT GTC TTC CAA GTA CGA 1783 
Thr He Leu Asn Leu Ser Pro Gly Lys He Tyr Val Phe Gin Val Arg 
485 490 495 

GCT AGA ACA GCA GTG GGT TAT GGC CCA TAC AGT GGA AAG ATG TAT TTC 1831 
Ala Arg Thr Ala Val Gly Tyr Gly Pro Tyr Ser Gly Lys Met Tyr Phe 
500 505 510 

CAG ACT TTA ATG GCA GGA GAG CAC TCG GAG ATG GCA CAG GAC CGA CTG 1879 
Gin Thr Leu Met Ala Gly Glu His Ser Glu Met Ala Gin Asp Arg Leu 
515 520 525 530 

CCA CTT ATT GTG GGC TCA GCA CTT GGT GGT CTG GCA TTC TTG GTA ATT 1927 
Pro Leu He Val Gly Ser Ala Leu Gly Gly Leu Ala Phe Leu Val He 
535 540 545 

GCT GCC ATT GCC ATT CTT GCC ATC ATC TTC AAG AGT AAA AGG CGA GAG 1975 
Ala Ala He Ala He Leu Ala He He Phe Lys Ser Lys Arg Arg Glu 
550 555 560 

ACT CCA TAC ACA GAC CGC CTG CAG CAG TAT ATC AGT ACA CGA GGA CTT 2023 
Thr Pro Tyr Thr Asp Arg Leu Gin Gin Tyr He Ser Thr Arg Gly Leu 
565 570 575 

GGA GTG AAG TAT TAC ATT GAT CCT TCC ACQ TAT GAA GAT CCC AAT GAA 2071 
Gly Val Lys Tyr Tyr He Asp Pro Ser Thr Tyr Glu Asp Pro Asn Glu 
580 585 590

GCT ATT CGA GAG TTT GCC AAA GAG ATA GAT GTG TCC TTC ATC AAA ATT 2119
Ala Ile Arg Glu Phe Ala Lys Glu Ile Asp Val Ser Phe Ile Lys Ile
595 600 605 610 

GAG GAG GTC ATT GGA TCA GGA GAA TTT GGA GAG GTG TGC TTT GGG CGC 2167 
Glu Glu Val He Gly Ser Gly Glu Phe Gly Glu Val Cys Phe Gly Arg 
615 620 625 

CTA AAA CAC CCA GGG AAA CGT GAA TAC ACA GTA GCT ATT AAA ACC CTG 2215 
Leu Lys His Pro Gly Lys Arg Glu Tyr Thr Val Ala He Lys Thr Leu 
630 635 640 

AAG TCA GGT TAT ACT GAT GAA CAG CGT CGA GAG TTC CTG AGC GAG GCC 2263 
Lys Ser Gly Tyr Thr Asp Glu Gin Arg Arg Glu Phe Leu Ser Glu Ala 
645 650 655 

AGC ATC ATG GGG CAA TTT GAG CAT CCC AAT GTC ATC CAC CTG GAG GGC 2311 
Ser lie Met Gly Gin Phe Glu His Pro Asn Val He His Leu Glu Gly 
660 665 670 

GTG GTC ACC AAA AGC CGA CCA GTC ATG ATT GTC ACA GAA TTC ATG GAG 2359 
Val Val Thr Lys Ser Arg Pro Val Met He Val Thr Glu Phe Met Glu 
675 680 685 690 

AAT GGA TCA CTG GAT TCC TTC CTC AGG GAG AAG GAG GGA CAG TTC AGT 2407
Asn Gly Ser Leu Asp Ser Phe Leu Arg Glu Lys Glu Gly Gln Phe Ser
695 700 705 

GTG TTA CAG CTG GTG GGA ATG CTA CGA GGG ATT GCA GCA GGC ATG CGC 2455 
Val Leu Gin Leu Val Gly Met Leu Arg Gly He Ala Ala Gly Met Arg 
710 715 720 

TAC CTT TCA GAC ATG AAC TAT GTG CAT CGT GAT CTC GCA GCA CGT AAC 2503 
Tyr Leu Ser Asp Met Asn Tyr Val His Arg Asp Leu Ala Ala Arg Asn 
725 730 735 

ATC TTA GTC AAC AGT AAC CTT GTA TGC AAG GTG TCA GAC TTT GGT TTG 2551
Ile Leu Val Asn Ser Asn Leu Val Cys Lys Val Ser Asp Phe Gly Leu
740 745 750 

TCT CGC TTT CTG GAA GAT GAT GCT TCA AAT CCC ACT TAT ACT GGA GCT 2599 
Ser Arg Phe Leu Glu Asp Asp Ala Ser Asn Pro Thr Tyr Thr Gly Ala 
755 760 765 770

CTG GGT TGC AAA ATC CCC ATC CGT TGG ACT GCC CCT GAA GCT GTC CAG 2647 
Leu Gly Cys Lys He Pro He Arg Trp Thr Ala Pro Glu Ala Val Gin 
775 780 785 

TAT CGC AAG TTC ACC TCC TCC AGT GAT GTC TGG AGC TAT GGC ATT GTC 2695 
Tyr Arg Lys Phe Thr Ser Ser Ser Asp Val Trp Ser Tyr Gly He Val 
790 795 800 

ATG TGG GAG GTG ATG TCC TAT GGT GAG AGA CCT TAC TGG GAC ATG TCC 2743 
Met Trp Glu Val Met Ser Tyr Gly Glu Arg Pro Tyr Trp Asp Met Ser 
805 810 815 

AAC CAG GAT GTA ATT AAT GCC ATT GAC CAG GAC TAT CGC CTG CCA CCA 2791 
Asn Gin Asp Val lie Asn Ala He Asp Gin Asp Tyr Arg Leu Pro Pro 
820 825 830 

CCC CCA GAC TGC CCA ACT GTT TTG CAT CTG CTG ATG CTT GAC TGC TGG 2839 
Pro Pro Asp Cys Pro Thr Val Leu His Leu Leu Met Leu Asp Cys Trp 
835 840 845 850 

CAG AAG GAT CGA GTC CAG AGA CCA AAA TTT GAA CAA ATA GTC AGT GCC 2887 
Gin Lys Asp Arg Val Gin Arg Pro Lys Phe Glu Gin He Val Ser Ala 
855 860 865

CTA GAT AAA ATG ATC CGC AAG CCA TCT GCT CTC AAA GCC ACT GGC ACT 2935
Leu Asp Lys Met Ile Arg Lys Pro Ser Ala Leu Lys Ala Thr Gly Thr
870 875 880 

GGG AGC AGC AGA CCA TCT CAG CCT CTC CTG AGC AAC TCC CCT CCA GAT 2983 
Gly Ser Ser Arg Pro Ser Gin Pro Leu Leu Ser Asn Ser Pro Pro Asp 
885 890 895 

TTT CCT TCA CTC AGC AAT GCC CAC GAG TGG TTG GAT GCC ATC AAG ATG 3031 
Phe Pro Ser Leu Ser Asn Ala His Glu Trp Leu Asp Ala lie Lys Met 
900 905 910 

GGT CGT TAC AAG GAG AAT TTT GAC CAG GCT GGT CTG ATT ACA TTT GAT 3079 
Gly Arg Tyr Lys Glu Asn Phe Asp Gin Ala Gly Leu lie Thr Phe Asp 
915 920 925 930 

GTC ATA TCA CGC ATG ACT CTG GAA GAT CTC CAG CGT ATT GGA ATC ACC 3127 
Val lie Ser Arg Met Thr Leu Glu Asp Leu Gin Arg lie Gly lie Thr 
935 940 945 

CTG GTT GGT CAC CAG AAA AAG ATT CTA AAC AGC ATC CAG CTC ATG AAA 3175 
Leu Val Gly His Gin Lys Lys lie Leu Asn Ser lie Gin Leu Met Lys 
950 955 960 

GTT CAT TTG AAC CAG CTT GAA CCA GTT GAA GTG TGATGCTTTA AGTCTCTATT 3228 
Val His Leu Asn Gin Leu Glu Pro Val Glu Val 
965 970 



TCACCAGACT CAAATTCTGA AAGAGTCCTG AGGGGATTCA GAGGGATTGT CACTGTATGA 3288

AAAGGAAATG GCAAGATGCT CCTTGAAGAC TTACTGCACC TAGAGAGTAG ACATTACACA 3348

TTCCATTCCA CCAGCAAAAA GAGAATCTTG CCATCATTTA AAAGCAGAGT TAAATAGCTG 3408

GTGGTTAAAT ATGACTGGCA TCATACACTA GGAGTAGGTC AGGGAGGGAA AGTTATAGTA 3468

ATGCAGAGTG GAGCTGGTAT AATAGTTTGG ACAGACCACA AGCACCTGCT AGCTCTTCTC 3528

CACTAAATAA AAAATCAGAC AATTCTCCAG TGCCATCAGC AGGCTTTATC TGTGACTGGG 3588

AACAAAGAAA TCACAATTTT TCCAAGAGAG TATCAGCACA TTGTGAGAGT TATCACTCAG 3648

TTGGAAATGG ACATCACTTG CTATGCCAGA TTTGTGAGAA ACTGGAGTTC CACTGAGTGC 3708

ACCATATGTG GTAAACAATA AGGTACATCA CCTCGTAATT TTTACAGAGG TTGAGAGTAA 3768

AGGGCCCA 3776



(2) INFORMATION FOR SEQ ID NO:8: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 973 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8: 

Met Thr Glu lie Leu Leu Asp Thr Thr Gly Glu Thr Ser Glu lie Gly 
1 5 10 15

Trp Thr Ser His Pro Pro Asp Gly Trp Glu Glu Val Ser Val Arg Asp 
20 25 30

Asp Lys Glu Arg Gln Ile Arg Thr Phe Gln Val Cys Asn Met Asp Glu
35 40 45 

Pro Gly Gin Asn Asn Trp Leu Arg Thr His Phe lie Glu Arg Arg Gly 
50 55 60 

Ala His Arg Val His Val Arg Leu His Phe Ser Val Arg Asp Cys Ala 
65 70 75 80 

Ser Met Arg Thr Val Ala Ser Thr Cys Lys Glu Thr Phe Thr Leu Tyr 
85 90 95 

Tyr His Gin Ser Asp Val Asp lie Ala Ser Gin Glu Leu Pro Glu Trp 
100 105 110 

His Glu Gly Pro Trp Thr Lys Val Asp Thr lie Ala Ala Asp Glu Ser 
115 120 125

Phe Ser Gin Val Asp Arg Thr Gly Lys Val Val Arg Met Asn Val Lys 
130 135 140 

Val Arg Ser Phe Gly Pro Leu Thr Lys His Gly Phe Tyr Leu Ala Phe 
145 150 155 160 

Gin Asp Ser Gly Ala Cys Met Ser Leu Val Ala Val Gin Val Phe Phe 
165 170 175 

Tyr Lys Cys Pro Ala Val Val Lys Gly Phe Ala Ser Phe Pro Glu Thr 
180 185 190 

Phe Ala Gly Gly Glu Arg Thr Ser Leu Val Glu Ser Leu Gly Thr Cys 
195 200 205 

Val Ala Asn Ala Glu Glu Ala Ser Thr Thr Gly Ser Ser Gly Val Arg 
210 215 220 

Leu His Cys Asn Gly Glu Gly Glu Trp Met Val Ala Thr Gly Arg Cys 
225 230 235 240 

Ser Cys Lys Ala Gly Tyr Gin Ser Val Asp Asn Glu Gin Ala Cys Gin 
245 250 255 

Ala Cys Pro lie Gly Ser Phe Lys Ala Ser Val Gly Asp Asp Pro Cys 
260 265 270 

Leu Leu Cys Pro Ala His Ser His Ala Pro Leu Pro Leu Pro Gly Ser 
275 280 285 

lie Glu Cys Val Cys Gin Ser His Tyr Tyr Arg Ser Ala Ser Asp Asn 
290 295 300 

Ser Asp Ala Pro Cys Thr Gly lie Pro Ser Ala Pro Arg Asp Leu Ser 
305 310 315 320 

Tyr Glu lie Val Gly Ser Asn Val Leu Leu Thr Trp Arg Leu Pro Lys 
325 330 335 

Asp Leu Gly Gly Arg Lys Asp Val Phe Phe Asn Val lie Cys Lys Glu 
340 345 350 

Cys Pro Thr Arg Ser Ala Gly Thr Cys Val Arg Cys Gly Asp Asn Val 
355 360 365 

Gin Phe Glu Pro Arg Gin Val Gly Leu Thr Glu Ser Arg Val Gin Val 
370 375 380

Ser Asn Leu Leu Ala Arg Val Gln Tyr Thr Phe Glu Ile Gln Ala Val
385 390 395 400 

Asn Leu Val Thr Glu Leu Ser Ser Glu Ala Pro Gin Tyr Ala Thr He 
405 410 415 

Asn Val Ser Thr Ser Gin Ser Val Pro Ser Ala He Pro Met Met His 
420 425 430 

Gin Val Ser Arg Ala Thr Ser Ser He Thr Leu Ser Trp Pro Gin Pro 
435 440 445 

Asp Gin Pro Asn Gly Val He Leu Asp Tyr Gin Leu Arg Tyr Phe Asp 
450 455 460

Lys Ala Glu Asp Glu Asp Asn Ser Phe Thr Leu Thr Ser Glu Thr Asn 
465 470 475 480 

Met Ala Thr He Leu Asn Leu Ser Pro Gly Lys He Tyr Val Phe Gin 
485 490 495 

Val Arg Ala Arg Thr Ala Val Gly Tyr Gly Pro Tyr Ser Gly Lys Met 
500 505 510 

Tyr Phe Gin Thr Leu Met Ala Gly Glu His Ser Glu Met Ala Gin Asp 
515 520 525 

Arg Leu Pro Leu He Val Gly Ser Ala Leu Gly Gly Leu Ala Phe Leu 
530 535 540 

Val He Ala Ala He Ala He Leu Ala He He Phe Lys Ser Lys Arg 
545 550 555 560 

Arg Glu Thr Pro Tyr Thr Asp Arg Leu Gin Gin Tyr He Ser Thr Arg 
565 570 575 

Gly Leu Gly Val Lys Tyr Tyr He Asp Pro Ser Thr Tyr Glu Asp Pro 
580 585 590 

Asn Glu Ala He Arg Glu Phe Ala Lys Glu He Asp Val Ser Phe He 
595 600 605 

Lys He Glu Glu Val He Gly Ser Gly Glu Phe Gly Glu Val Cys Phe 
610 615 620 

Gly Arg Leu Lys His Pro Gly Lys Arg Glu Tyr Thr Val Ala He Lys 
625 630 635 640 

Thr Leu Lys Ser Gly Tyr Thr Asp Glu Gin Arg Arg Glu Phe Leu Ser 
645 650 655 

Glu Ala Ser He Met Gly Gin Phe Glu His Pro Asn Val He His Leu 
660 665 670 

Glu Gly Val Val Thr Lys Ser Arg Pro Val Met He Val Thr Glu Phe 
675 680 685 

Met Glu Asn Gly Ser Leu Asp Ser Phe Leu Arg Glu Lys Glu Gly Gin 
690 695 700 

Phe Ser Val Leu Gin Leu Val Gly Met Leu Arg Gly He Ala Ala Gly 
705 710 715 720 

Met Arg Tyr Leu Ser Asp Met Asn Tyr Val His Arg Asp Leu Ala Ala
725 730 735

Arg Asn Ile Leu Val Asn Ser Asn Leu Val Cys Lys Val Ser Asp Phe
740 745 750

Gly Leu Ser Arg Phe Leu Glu Asp Asp Ala Ser Asn Pro Thr Tyr Thr 
755 760 765 

Gly Ala Leu Gly Cys Lys lie Pro lie Arg Trp Thr Ala Pro Glu Ala 
770 775 780 

Val Gin Tyr Arg Lys Phe Thr Ser Ser Ser Asp Val Trp Ser Tyr Gly 
785 790 795 800 

He Val Met Trp Glu Val Met Ser Tyr Gly Glu Arg Pro Tyr Trp Asp 
805 810 815 

Met Ser Asn Gin Asp Val He Asn Ala He Asp Gin Asp Tyr Arg Leu 
820 825 830 

Pro Pro Pro Pro Asp Cys Pro Thr Val Leu His Leu Leu Met Leu Asp 
835 840 845 

Cys Trp Gin Lys Asp Arg Val Gin Arg Pro Lys Phe Glu Gin He Val 
850 855 860 

Ser Ala Leu Asp Lys Met He Arg Lys Pro Ser Ala Leu Lys Ala Thr 
865 870 875 880

Gly Thr Gly Ser Ser Arg Pro Ser Gin Pro Leu Leu Ser Asn Ser Pro 
885 890 895 

Pro Asp Phe Pro Ser Leu Ser Asn Ala His Glu Trp Leu Asp Ala He 
900 905 910 

Lys Met Gly Arg Tyr Lys Glu Asn Phe Asp Gin Ala Gly Leu He Thr 
915 920 925 

Phe Asp Val He Ser Arg Met Thr Leu Glu Asp Leu Gin Arg He Gly 
930 935 940 

He Thr Leu Val Gly His Gin Lys Lys He Leu Asn Ser He Gin Leu 
945 950 955 960 

Met Lys Val His Leu Asn Gin Leu Glu Pro Val Glu Val 
965 970 

(2) INFORMATION FOR SEQ ID NO: 9: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 3546 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: both

(D) TOPOLOGY: linear 

(ix) FEATURE: 

(A) NAME/KEY: CDS 

(B) LOCATION: 2.. 2920 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9:

C GGG GTC TCC TCG AGG GCG CGG CGG CCG CCG GGC AGC AGC AGG AGC 46 
Gly Val Ser Ser Arg Ala Arg Arg Pro Pro Gly Ser Ser Arg Ser 
1 5 10 15



WO 95/15375 PCT/US94/10140 




AGC AGG AGG GGG GTG ACC TCG GAG CTG GCA TGG ACA ACC CAT CCG GAG 94 
Ser Arg Arg Gly Val Thr Ser Glu Leu Ala Trp Thr Thr His Pro Glu 
20 25 30

ACG GGG TGG GAA GAG GTC AGT GGT TAC GAC GAG GCT ATG AAC CCC ATC 142 
Thr Gly Trp Glu Glu Val Ser Gly Tyr Asp Glu Ala Met Asn Pro lie 
35 40 45 

CGC ACA TAC CAG GTG TGC AAC GTG CGG GAG GCC AAC CAG AAC AAC TGG 190 
Arg Thr Tyr Gin Val Cys Asn Val Arg Glu Ala Asn Gin Asn Asn Trp 
50 55 60 

CTT CGC ACC AAG TTC ATT CAG CGC CAG GAC GTC CAG CGT GTC TAC GTG 238 
Leu Arg Thr Lys Phe lie Gin Arg Gin Asp Val Gin Arg Val Tyr Val 
65 70 75 

GAG CTG AAA TTC ACT GTG CGG GAC TGC AAC AGC ATC CCC AAC ATC CCT 286 
Glu Leu Lys Phe Thr Val Arg Asp Cys Asn Ser lie Pro Asn He Pro 
80 85 90 95 

GGT TCC TGC AAA GAG ACC TTC AAC CTC TTC TAT TAT GAG TCA GAT ACG 334 
Gly Ser Cys Lys Glu Thr Phe Asn Leu Phe Tyr Tyr Glu Ser Asp Thr 
100 105 110 

GAT TCT GCC TCT GCC AAT AGC CCT TTC TGG ATG GAG AAC CCC TAT ATC 382 
Asp Ser Ala Ser Ala Asn Ser Pro Phe Trp Met Glu Asn Pro Tyr He 
115 120 125 

AAA GTG GAT ACA ATT GCT CCG GAT GAG AGC TTC TCC AAA CTG GAG TCC 430 
Lys Val Asp Thr He Ala Pro Asp Glu Ser Phe Ser Lys Leu Glu Ser 
130 135 140 

GGC CGT GTG AAC ACC AAG GTG CGC AGC TTT GGG CCG CTC TCC AAG AAT 478 
Gly Arg Val Asn Thr Lys Val Arg Ser Phe Gly Pro Leu Ser Lys Asn 
145 150 155 

GGC TTT TAT CTG GCT TTC CAG GAC CTG GGG GCC TGC ATG TCC CTT ATC 526 
Gly Phe Tyr Leu Ala Phe Gin Asp Leu Gly Ala Cys Met Ser Leu He 
160 165 170 175 

TCC GTC CGG GCT TTC TAC AAG AAA TGT TCC AAC ACC ATC GCT GGC TTT 574 
Ser Val Arg Ala Phe Tyr Lys Lys Cys Ser Asn Thr He Ala Gly Phe 
180 185 190 

GCT ATC TTC CCG GAG ACC CTA ACG GGG GCT GAG CCC ACG TCG CTG GTC 622 
Ala He Phe Pro Glu Thr Leu Thr Gly Ala Glu Pro Thr Ser Leu Val 
195 200 205 

ATT GCG CCG GGC ACC TGC ATC CCC AAC GCA GTG GAA GTG TCT GTG CCC 670 
He Ala Pro Gly Thr Cys He Pro Asn Ala Val Glu Val Ser Val Pro 
210 215 220 

CTG AAG CTG TAC TGC AAC GGT GAT GGC GAG TGG ATG GTG CCT GTG GGA 718 
Leu Lys Leu Tyr Cys Asn Gly Asp Gly Glu Trp Met Val Pro Val Gly 
225 230 235 

GCG TGC ACG TGT GCT GCT GGG TAC GAG CCA GCC ATG AAG GAT ACC CAG 766 
Ala Cys Thr Cys Ala Ala Gly Tyr Glu Pro Ala Met Lys Asp Thr Gin 
240 245 250 255 

TGC CAA GCA TGC GGC CCG GGG ACG TTC AAA TCC AAG CAG GGC GAG GGC 814 
Cys Gin Ala Cys Gly Pro Gly Thr Phe Lys Ser Lys Gin Gly Glu Gly 
260 265 270 

CCC TGC TCC CCC TGC CCT CCC AAC AGC CGC ACC ACC GCG GGG GCA GCC 862 
Pro Cys Ser Pro Cys Pro Pro Asn Ser Arg Thr Thr Ala Gly Ala Ala 
275 280 285 

ACA GTC TGC ATA TGT CGC AGC GGC TTC TTC CGA GCA GAC GCG GAC CCC 910
Thr Val Cys lie Cys Arg Ser Gly Phe Phe Arg Ala Asp Ala Asp Pro 
290 295 300 

GCA GAC AGC GCC TGC ACC AGT GTG CCC TCA GCC CCA CGC AGC GTC ATC 958
Ala Asp Ser Ala Cys Thr Ser Val Pro Ser Ala Pro Arg Ser Val lie 
305 310 315 

TCC AAC GTG AAT GAG ACG TCG TTG GTG CTG GAG TGG AGC GAG CCG CAG 1006 
Ser Asn Val Asn Glu Thr Ser Leu Val Leu Glu Trp Ser Glu Pro Gin 
320 325 330 335 

GAC GCG GGC GGG CGG GAT GAC CTG CTC TAC AAC GTC ATC TGC AAG AAG 1054 
Asp Ala Gly Gly Arg Asp Asp Leu Leu Tyr Asn Val lie Cys Lys Lys 
340 345 350 

TGC AGC GTG GAG CGG CGG CTG TGC AGC CGC TGC GAC GAC AAC GTG GAG 1102 
Cys Ser Val Glu Arg Arg Leu Cys Ser Arg Cys Asp Asp Asn Val Glu 
355 360 365 

TTC GTG CCG CGC CAG CTG GGC CTC ACT GGC CTC ACT GAG CGA CGC ATC 1150 
Phe Val Pro Arg Gin Leu Gly Leu Thr Gly Leu Thr Glu Arg Arg lie 
370 375 380 

TAC ATC AGC AAG GTG ATG GCC CAC CCC CAG TAC ACC TTC GAG ATC CAG 1198 
Tyr He Ser Lys Val Met Ala His Pro Gin Tyr Thr Phe Glu He Gin 
385 390 395 

GCG GTG AAT GGC ATC TCC AGC AAG AGC CCC TAC CCT CCC CAT TTT GCC 1246 
Ala Val Asn Gly He Ser Ser Lys Ser Pro Tyr Pro Pro His Phe Ala 
400 405 410 415 

TCC GTC AAC ATC ACG ACC AAC CAG GCA GCC CCA TCT GCC GTG CCC ACC 1294 
Ser Val Asn He Thr Thr Asn Gin Ala Ala Pro Ser Ala Val Pro Thr 
420 425 430 

ATG CAT CTG CAC AGC AGC ACC GGG AAC AGC ATG ACA CTG TCA TGG ACT 1342 
Met His Leu His Ser Ser Thr Gly Asn Ser Met Thr Leu Ser Trp Thr 
435 440 445 

CCC CCG GAA AGG CCC AAC GGC ATC ATT CTC GAC TAT GAA ATC AAG TAC 1390 
Pro Pro Glu Arg Pro Asn Gly He He Leu Asp Tyr Glu He Lys Tyr 
450 455 460 

TCC GAG AAG CAA GGC CAG GGT GAC GGC ATT GCC AAC ACT GTC ACC AGC 1438 
Ser Glu Lys Gin Gly Gin Gly Asp Gly He Ala Asn Thr Val Thr Ser 
465 470 475 

CAG AAG AAC TCG GTG CGG CTG GAC GGA CTG AAG GCC AAT GCT CGG TAC 1486 
Gin Lys Asn Ser Val Arg Leu Asp Gly Leu Lys Ala Asn Ala Arg Tyr 
480 485 490 495 

ATG GTG CAG GTC CGG GCG CGC ACA GTG GCT GGA TAC GGC CGC TAC AGC 1534 
Met Val Gin Val Arg Ala Arg Thr Val Ala Gly Tyr Gly Arg Tyr Ser 
500 505 510

CTC CCC ACC GAG TTC CAG ACG ACT GCG GAG GAT GGC TCC ACC AGC AAG 1582 
Leu Pro Thr Glu Phe Gin Thr Thr Ala Glu Asp Gly Ser Thr Ser Lys 
515 520 525 

ACT TTC CAG GAG CTT CCT CTC ATC GTG GGT TCA GCC ACC GCG GGA CTG 1630 
Thr Phe Gin Glu Leu Pro Leu He Val Gly Ser Ala Thr Ala Gly Leu 
530 535 540 

CTG TTT GTC ATC GTG GTG GTC ATC ATC GCT ATT GTC TGC TTC AGG AAG 1678 
Leu Phe Val He Val Val Val He He Ala He Val Cys Phe Arg Lys 
545 550 555 

CAG CGC AAC AGC ACA GAT CCC GAG TAC ACA GAG AAG CTG CAG CAA TAT 1726
Gin Arg Asn Ser Thr Asp Pro Glu Tyr Thr Glu Lys Leu Gin Gin Tyr 
560 565 570 575 

GTC ACT CCT GGG ATG AAG GTC TAC ATT GAC CCC TTC ACC TAT GAA GAC 1774 
Val Thr Pro Gly Met Lys Val Tyr lie Asp Pro Phe Thr Tyr Glu Asp 
580 585 590 

CCA AAT GAA GCT GTC CGG GAA TTC GCC AAA GAG ATT GAT ATC TCC TGT 1822 
Pro Asn Glu Ala Val Arg Glu Phe Ala Lys Glu lie Asp He Ser Cys 
595 600 605 

GTC AAA ATT GAG GAG GTC ATT GGA GCA GGA GAG TTT GGT GAG GTG TGC 1870 
Val Lys He Glu Glu Val He Gly Ala Gly Glu Phe Gly Glu Val Cys 
610 615 620 

CGT GGG CGC CTG AAG CTG CCT GGC CGC CGT GAG ATC TTT GTG GCC ATC 1918 
Arg Gly Arg Leu Lys Leu Pro Gly Arg Arg Glu He Phe Val Ala He 
625 630 635 

AAG ACA CTG AAG GTG GGC TAC ACA GAG AGG CAG CGG CGG GAC TTC CTG 1966 
Lys Thr Leu Lys Val Gly Tyr Thr Glu Arg Gin Arg Arg Asp Phe Leu 
640 645 650 655 

AGT GAG GCC AGC ATC ATG GGC CAG TTC GAC CAC CCC AAC ATC ATC CAC 2014 
Ser Glu Ala Ser He Met Gly Gin Phe Asp His Pro Asn He lie His 
660 665 670 

CTG GAG GGC GTG GTG ACC AAG AGC CGC CCT GTC ATG ATC ATC ACA GAG 2062 
Leu Glu Gly Val Val Thr Lys Ser Arg Pro Val Met He He Thr Glu 
675 680 685 

TTC ATG GAG AAC TGC GCT CTC GAC TCC TTC CTC CGG CTG AAT GAT GGG 2110 
Phe Met Glu Asn Cys Ala Leu Asp Ser Phe Leu Arg Leu Asn Asp Gly 
690 695 700 

CAG TTC ACG GTC ATC CAG CTG GTG GGG ATG CTG CGA GGC ATC GCT GCT 2158 
Gin Phe Thr Val He Gin Leu Val Gly Met Leu Arg Gly He Ala Ala 
705 710 715 

GGC ATG AAG TAC CTC TCA GAG ATG AAC TAC GTG CAC CGA GAC CTG GCT 2206 
Gly Met Lys Tyr Leu Ser Glu Met Asn Tyr Val His Arg Asp Leu Ala 
720 725 730 735 

GCC CGC AAC ATC CTG GTC AAC AGC AAC TTG GTC TGC AAA GTG TCT GAC 2254 
Ala Arg Asn He Leu Val Asn Ser Asn Leu Val Cys Lys Val Ser Asp 
740 745 750

TTC GGG CTC TCC CGC TTT TTG GAG GAT GAT CCA GCC GAC CCC ACC TAC 2302 
Phe Gly Leu Ser Arg Phe Leu Glu Asp Asp Pro Ala Asp Pro Thr Tyr 
755 760 765 

ACC AGC TCC CTG GGA GGC AAG ATC CCC ATC AGG TGG ACA GCT CCT GAG 2350 
Thr Ser Ser Leu Gly Gly Lys He Pro He Arg Trp Thr Ala Pro Glu 
770 775 780 

GCC ATC GCC TAC CGC AAA TTC ACG TCG GCC AGC GAC GTG TGG AGC TAC 2398 
Ala He Ala Tyr Arg Lys Phe Thr Ser Ala Ser Asp Val Trp Ser Tyr 
785 790 795 

GGC ATC GTC ATG TGG GAA GTG ATG TCC TAC GGG GAG CGA CCC TAC TGG 2446 
Gly He Val Met Trp Glu Val Met Ser Tyr Gly Glu Arg Pro Tyr Trp 
800 805 810 815

GAC ATG TCC AAC CAG GAT GTG ATC AAC GCG GTG GAG CAG GAT TAC CGC 2494 
Asp Met Ser Asn Gin Asp Val He Asn Ala Val Glu Gin Asp Tyr Arg 
820 825 830 

CTG CCA CCC CCC ATG GAC TGC CCC ACA GCA CTG CAC CAG CTG ATG CTG 2542
Leu Pro Pro Pro Met Asp Cys Pro Thr Ala Leu His Gln Leu Met Leu
835 840 845

GAC TGC TGG GTG CGG GAC CGC AAC CTG CGG CCC AAG TTT GCA CAG ATT 2590
Asp Cys Trp Val Arg Asp Arg Asn Leu Arg Pro Lys Phe Ala Gln Ile
850 855 860

GTC AAC ACG CTG GAC AAG CTG ATC CGC AAT GCT GCC AGC CTG AAG GTC 2638
Val Asn Thr Leu Asp Lys Leu Ile Arg Asn Ala Ala Ser Leu Lys Val
865 870 875

ATC GCC AGC GTC CAG TCC GGT GTC TCC CAG CCG CTC CTG GAC CGC ACC 2686
Ile Ala Ser Val Gln Ser Gly Val Ser Gln Pro Leu Leu Asp Arg Thr
880 885 890 895

GTG CCC GAT TAC ACC ACC TTC ACC ACC GTG GGA GAC TGG CTG GAT GCC 2734
Val Pro Asp Tyr Thr Thr Phe Thr Thr Val Gly Asp Trp Leu Asp Ala
900 905 910

ATC AAA ATG GGA CGG TAC AAG GAG AAC TTC GTC AAC GCC GGC TTC GCC 2782
Ile Lys Met Gly Arg Tyr Lys Glu Asn Phe Val Asn Ala Gly Phe Ala
915 920 925

TCC TTT GAC CTG GTG GCA CAG ATG ACA GCA GAG GAC CTG CTA AGG ATA 2830
Ser Phe Asp Leu Val Ala Gln Met Thr Ala Glu Asp Leu Leu Arg Ile
930 935 940

GGA GTG ACG CTA GCA GGG CAC CAG AAG AAG ATC CTG AGC AGC ATT CAG 2878
Gly Val Thr Leu Ala Gly His Gln Lys Lys Ile Leu Ser Ser Ile Gln
945 950 955

GAC ATG AGG CTG CAG ATG AAC CAG ACG CTG CCG GTT CAG GTT 2920
Asp Met Arg Leu Gln Met Asn Gln Thr Leu Pro Val Gln Val
960 965 970

TGACCGCAGG GACTCTGCAT TGGAACGGAC TGAGGGAACC TGCCAACCAG GTTCTGTTTG 2980

CGGTGCAGCC CGGCTTCCCG ATTTCCCCTT CCCGTGGCGC TCCTCTGCCT CGGACGCTCG 3040

CCGGGGACAG GCTGGGCCGG GCCACCCTTC CCTGGATCAG AGGCACTCGT GCCGGGAGGG 3100

AGCCCGGCTT TTCGTCCCGT GTCCCGCAGC GGCGAGGCAG TGAACGCAGT CTTCATATTG 3160

AAGATGGATT ATGGGACGGA GATGGCGCAT CCGCTTCCCG CCCTGTCTCA GTGCTCATCA 3220

GTTTGAAGAG ATGTTCTGCT TCTTGGATTT CTTTACACCC CGGTTTTCCC CCCTCGAGTC 3280

CTCACTTCCC CCTATCCCTG AGGCCACAGA CTGTTGACCC GTCCGCTGAG TCCGTCAGAC 3340

GCTCCGAAGC CTTCCCCGAG CCCGGTCCCC GCGTGGAGAC GGCGCCAGGG ACGGGGCTAC 3400

GGCCCCAGAC AATCACTCCA CCCCTCCGCA CGAGGGTCCT CACTGGGACG TGTCTGAAGG 3460

GGAAAGGCTC TGCTCCCTTT TTGGCTTTGC ACGCCAGAAC CCGAACCCCG TGAGATTTAC 3520

TATGCAGGGA GTTAGGCAAA AAAAAG 3546



(2) INFORMATION FOR SEQ ID NO:10:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 973 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10:

Gly Val Ser Ser Arg Ala Arg Arg Pro Pro Gly Ser Ser Arg Ser Ser
1 5 10 15

Arg Arg Gly Val Thr Ser Glu Leu Ala Trp Thr Thr His Pro Glu Thr
20 25 30

Gly Trp Glu Glu Val Ser Gly Tyr Asp Glu Ala Met Asn Pro Ile Arg
35 40 45

Thr Tyr Gln Val Cys Asn Val Arg Glu Ala Asn Gln Asn Asn Trp Leu
50 55 60

Arg Thr Lys Phe Ile Gln Arg Gln Asp Val Gln Arg Val Tyr Val Glu
65 70 75 80

Leu Lys Phe Thr Val Arg Asp Cys Asn Ser Ile Pro Asn Ile Pro Gly
85 90 95

Ser Cys Lys Glu Thr Phe Asn Leu Phe Tyr Tyr Glu Ser Asp Thr Asp
100 105 110

Ser Ala Ser Ala Asn Ser Pro Phe Trp Met Glu Asn Pro Tyr Ile Lys
115 120 125

Val Asp Thr Ile Ala Pro Asp Glu Ser Phe Ser Lys Leu Glu Ser Gly
130 135 140

Arg Val Asn Thr Lys Val Arg Ser Phe Gly Pro Leu Ser Lys Asn Gly
145 150 155 160

Phe Tyr Leu Ala Phe Gln Asp Leu Gly Ala Cys Met Ser Leu Ile Ser
165 170 175

Val Arg Ala Phe Tyr Lys Lys Cys Ser Asn Thr Ile Ala Gly Phe Ala
180 185 190

Ile Phe Pro Glu Thr Leu Thr Gly Ala Glu Pro Thr Ser Leu Val Ile
195 200 205

Ala Pro Gly Thr Cys Ile Pro Asn Ala Val Glu Val Ser Val Pro Leu
210 215 220

Lys Leu Tyr Cys Asn Gly Asp Gly Glu Trp Met Val Pro Val Gly Ala
225 230 235 240

Cys Thr Cys Ala Ala Gly Tyr Glu Pro Ala Met Lys Asp Thr Gln Cys
245 250 255

Gln Ala Cys Gly Pro Gly Thr Phe Lys Ser Lys Gln Gly Glu Gly Pro
260 265 270

Cys Ser Pro Cys Pro Pro Asn Ser Arg Thr Thr Ala Gly Ala Ala Thr
275 280 285

Val Cys Ile Cys Arg Ser Gly Phe Phe Arg Ala Asp Ala Asp Pro Ala
290 295 300

Asp Ser Ala Cys Thr Ser Val Pro Ser Ala Pro Arg Ser Val Ile Ser
305 310 315 320

Asn Val Asn Glu Thr Ser Leu Val Leu Glu Trp Ser Glu Pro Gln Asp
325 330 335

Ala Gly Gly Arg Asp Asp Leu Leu Tyr Asn Val Ile Cys Lys Lys Cys
340 345 350

Ser Val Glu Arg Arg Leu Cys Ser Arg Cys Asp Asp Asn Val Glu Phe
355 360 365

Val Pro Arg Gln Leu Gly Leu Thr Gly Leu Thr Glu Arg Arg Ile Tyr
370 375 380

Ile Ser Lys Val Met Ala His Pro Gln Tyr Thr Phe Glu Ile Gln Ala
385 390 395 400

Val Asn Gly Ile Ser Ser Lys Ser Pro Tyr Pro Pro His Phe Ala Ser
405 410 415

Val Asn Ile Thr Thr Asn Gln Ala Ala Pro Ser Ala Val Pro Thr Met
420 425 430

His Leu His Ser Ser Thr Gly Asn Ser Met Thr Leu Ser Trp Thr Pro
435 440 445

Pro Glu Arg Pro Asn Gly Ile Ile Leu Asp Tyr Glu Ile Lys Tyr Ser
450 455 460

Glu Lys Gln Gly Gln Gly Asp Gly Ile Ala Asn Thr Val Thr Ser Gln
465 470 475 480

Lys Asn Ser Val Arg Leu Asp Gly Leu Lys Ala Asn Ala Arg Tyr Met
485 490 495

Val Gln Val Arg Ala Arg Thr Val Ala Gly Tyr Gly Arg Tyr Ser Leu
500 505 510

Pro Thr Glu Phe Gln Thr Thr Ala Glu Asp Gly Ser Thr Ser Lys Thr
515 520 525

Phe Gln Glu Leu Pro Leu Ile Val Gly Ser Ala Thr Ala Gly Leu Leu
530 535 540

Phe Val Ile Val Val Val Ile Ile Ala Ile Val Cys Phe Arg Lys Gln
545 550 555 560

Arg Asn Ser Thr Asp Pro Glu Tyr Thr Glu Lys Leu Gln Gln Tyr Val
565 570 575

Thr Pro Gly Met Lys Val Tyr Ile Asp Pro Phe Thr Tyr Glu Asp Pro
580 585 590

Asn Glu Ala Val Arg Glu Phe Ala Lys Glu Ile Asp Ile Ser Cys Val
595 600 605

Lys Ile Glu Glu Val Ile Gly Ala Gly Glu Phe Gly Glu Val Cys Arg
610 615 620

Gly Arg Leu Lys Leu Pro Gly Arg Arg Glu Ile Phe Val Ala Ile Lys
625 630 635 640

Thr Leu Lys Val Gly Tyr Thr Glu Arg Gln Arg Arg Asp Phe Leu Ser
645 650 655

Glu Ala Ser Ile Met Gly Gln Phe Asp His Pro Asn Ile Ile His Leu
660 665 670

Glu Gly Val Val Thr Lys Ser Arg Pro Val Met Ile Ile Thr Glu Phe
675 680 685

Met Glu Asn Cys Ala Leu Asp Ser Phe Leu Arg Leu Asn Asp Gly Gln
690 695 700

Phe Thr Val Ile Gln Leu Val Gly Met Leu Arg Gly Ile Ala Ala Gly
705 710 715 720

Met Lys Tyr Leu Ser Glu Met Asn Tyr Val His Arg Asp Leu Ala Ala
725 730 735

Arg Asn Ile Leu Val Asn Ser Asn Leu Val Cys Lys Val Ser Asp Phe
740 745 750

Gly Leu Ser Arg Phe Leu Glu Asp Asp Pro Ala Asp Pro Thr Tyr Thr
755 760 765

Ser Ser Leu Gly Gly Lys Ile Pro Ile Arg Trp Thr Ala Pro Glu Ala
770 775 780

Ile Ala Tyr Arg Lys Phe Thr Ser Ala Ser Asp Val Trp Ser Tyr Gly
785 790 795 800

Ile Val Met Trp Glu Val Met Ser Tyr Gly Glu Arg Pro Tyr Trp Asp
805 810 815

Met Ser Asn Gln Asp Val Ile Asn Ala Val Glu Gln Asp Tyr Arg Leu
820 825 830

Pro Pro Pro Met Asp Cys Pro Thr Ala Leu His Gln Leu Met Leu Asp
835 840 845

Cys Trp Val Arg Asp Arg Asn Leu Arg Pro Lys Phe Ala Gln Ile Val
850 855 860

Asn Thr Leu Asp Lys Leu Ile Arg Asn Ala Ala Ser Leu Lys Val Ile
865 870 875 880

Ala Ser Val Gln Ser Gly Val Ser Gln Pro Leu Leu Asp Arg Thr Val
885 890 895

Pro Asp Tyr Thr Thr Phe Thr Thr Val Gly Asp Trp Leu Asp Ala Ile
900 905 910

Lys Met Gly Arg Tyr Lys Glu Asn Phe Val Asn Ala Gly Phe Ala Ser
915 920 925

Phe Asp Leu Val Ala Gln Met Thr Ala Glu Asp Leu Leu Arg Ile Gly
930 935 940

Val Thr Leu Ala Gly His Gln Lys Lys Ile Leu Ser Ser Ile Gln Asp
945 950 955 960

Met Arg Leu Gln Met Asn Gln Thr Leu Pro Val Gln Val
965 970



(2) INFORMATION FOR SEQ ID NO:11:



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 4097 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: both

(D) TOPOLOGY: linear 



(ix) FEATURE: 

(A) NAME/KEY: CDS 

(B) LOCATION: 10.. 3042 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11:

CGGCTTCTG ATG CCC GGC CCG GAG CGC ACC ATG GGG CCG TTG TGG TTC 48
Met Pro Gly Pro Glu Arg Thr Met Gly Pro Leu Trp Phe
1 5 10 

TGC TGT TTG CCC CTC GCC CTC TTG CCT CTG CTC GCC GCC GTG GAA GAG 96 
Cys Cys Leu Pro Leu Ala Leu Leu Pro Leu Leu Ala Ala Val Glu Glu 
15 20 25 

ACG CTG ATG GAC TCC ACA ACG GCC ACA GCA GAG CTG GGC TGG ATG GTG 144 
Thr Leu Met Asp Ser Thr Thr Ala Thr Ala Glu Leu Gly Trp Met Val 
30 35 40 45 

CAT CCT CCC TCA GGG TGG GAA GAG GTG AGT GGA TAC GAT GAG AAC ATG 192 
His Pro Pro Ser Gly Trp Glu Glu Val Ser Gly Tyr Asp Glu Asn Met 
50 55 60 

AAC ACC ATC CGC ACC TAC CAG GTG TGC AAC GTC TTT GAA TCC AGC CAA 240 
Asn Thr He Arg Thr Tyr Gin Val Cys Asn Val Phe Glu Ser Ser Gin 
65 70 75 

AAC AAC TGG CTG CGG ACC AAG TAC ATC CGG AGG CGA GGA GCG CAC CGC 288 
Asn Asn Trp Leu Arg Thr Lys Tyr He Arg Arg Arg Gly Ala His Arg 
80 85 90 

ATC CAC GTG GAG ATG AAA TTC TCC GTT CGG GAC TGC AGC AGC ATC CCC 336 
He His Val Glu Met Lys Phe Ser Val Arg Asp Cys Ser Ser He Pro 
95 100 105 

AAC GTC CCG GGC TCC TGT AAG GAG ACT TTT AAC CTC TAT TAC TAC GAA 384 
Asn Val Pro Gly Ser Cys Lys Glu Thr Phe Asn Leu Tyr Tyr Tyr Glu 
110 115 120 125

TCA GAC TTT GAC TCT GCC ACC AAG ACT TTT CCT AAC TGG ATG GAA AAC 432 
Ser Asp Phe Asp Ser Ala Thr Lys Thr Phe Pro Asn Trp Met Glu Asn 
130 135 140 

CCT TGG ATG AAG GTA GAT ACA ATT GCT GCC GAC GAG AGC TTC TCG CAG 480 
Pro Trp Met Lys Val Asp Thr He Ala Ala Asp Glu Ser Phe Ser Gin 
145 150 155 

GTG GAC CTT GGT GGG CGG GTG ATG AAG ATT AAC ACC GAG GTG CGC AGT 528 
Val Asp Leu Gly Gly Arg Val Met Lys He Asn Thr Glu Val Arg Ser 
160 165 170 

TTT GGG CCT GTC TCC AAA AAC GGT TTC TAC CTG GCC TTC CAG GAC TAC 576 
Phe Gly Pro Val Ser Lys Asn Gly Phe Tyr Leu Ala Phe Gin Asp Tyr 
175 180 185 

GGG GGC TGC ATG TCC TTG ATT GCA GTC CGT GTC TTT TAC CGC AAG TGT 624 
Gly Gly Cys Met Ser Leu He Ala Val Arg Val Phe Tyr Arg Lys Cys 
190 195 200 205 

CCC CGT GTG ATC CAG AAC GGG GCG GTC TTC CAG GAA ACC CTC TCG GGA 672 
Pro Arg Val He Gin Asn Gly Ala Val Phe Gin Glu Thr Leu Ser Gly 
210 215 220 

GCG GAG AGC ACA TCT CTG GTG GCA GCC CGG GGG ACG TGC ATC AGC AAT 720 
Ala Glu Ser Thr Ser Leu Val Ala Ala Arg Gly Thr Cys He Ser Asn 
225 230 235 

GCG GAG GAG GTG GAT GTG CCC ATC AAG CTG TAC TGC AAT GGG GAT GGC 768 
Ala Glu Glu Val Asp Val Pro He Lys Leu Tyr Cys Asn Gly Asp Gly 
240 245 250 

GAG TGG CTG GTG CCC ATC GGC CGC TGC ATG TGC AGG CCG GGC TAT GAG 816 
Glu Trp Leu Val Pro He Gly Arg Cys Met Cys Arg Pro Gly Tyr Glu 
255 260 265 

TCG GTG GAG AAT GGG ACC GTC TGC AGA GGC TGC CCA TCA GGG ACC TTC 864
Ser Val Glu Asn Gly Thr Val Cys Arg Gly Cys Pro Ser Gly Thr Phe 
270 275 280 285 

AAG GCC AGC CAA GGA GAT GAA GGA TGT GTC CAT TGT CCA ATT AAC AGC 912 
Lys Ala Ser Gin Gly Asp Glu Gly Cys Val His Cys Pro He Asn Ser 
290 295 300 

CGG ACG ACT TCG GAA GGG GCC ACG AAC TGC GTG TGC CGA AAC GGA TAT 960 
Arg Thr Thr Ser Glu Gly Ala Thr Asn Cys Val Cys Arg Asn Gly Tyr 
305 310 315 

TAC CGG GCA GAT GCT GAC CCC GTC GAC ATG CCA TGC ACC ACC ATC CCA 1008 
Tyr Arg Ala Asp Ala Asp Pro Val Asp Met Pro Cys Thr Thr He Pro 
320 325 330 

TCT GCC CCC CAG GCC GTG ATC TCC AGC GTG AAT GAA ACC TCC CTG ATG 1056 
Ser Ala Pro Gin Ala Val He Ser Ser Val Asn Glu Thr Ser Leu Met 
335 340 345 

CTG GAG TGG ACC CCA CCA CGA GAC TCA GGG GGC CGG GAG GAT CTG GTA 1104 
Leu Glu Trp Thr Pro Pro Arg Asp Ser Gly Gly Arg Glu Asp Leu Val 
350 355 360 365 

TAC AAC ATC ATC TGC AAG AGC TGT GGG TCA GGC CGT GGG GCG TGC ACG 1152 
Tyr Asn He He Cys Lys Ser Cys Gly Ser Gly Arg Gly Ala Cys Thr 
370 375 380 

CGC TGT GGG GAC AAC GTG CAG TTT GCC CCA CGC CAG CTG GGC CTG ACG 1200 
Arg Cys Gly Asp Asn Val Gin Phe Ala Pro Arg Gin Leu Gly Leu Thr 
385 390 395 

GAG CCT CGC ATC TAC ATC AGC GAC CTG CTG GCC CAC ACG CAG TAC ACC 1248 
Glu Pro Arg He Tyr He Ser Asp Leu Leu Ala His Thr Gin Tyr Thr 
400 405 410 

TTT GAG ATC CAG GCT GTG AAT GGG GTC ACC GAC CAG AGC CCC TTC TCC 1296 
Phe Glu He Gin Ala Val Asn Gly Val Thr Asp Gin Ser Pro Phe Ser 
415 420 425 

CCA CAG TTT GCA TCA GTG AAT ATC ACC ACC AAC CAG GCT GCT CCT TCA 1344 
Pro Gin Phe Ala Ser Val Asn He Thr Thr Asn Gin Ala Ala Pro Ser 
430 435 440 445 

GCC GTG TCC ATA ATG CAC CAG GTC AGC CGC ACT GTG GAC AGC ATT ACC 1392 
Ala Val Ser lie Met His Gin Val Ser Arg Thr Val Asp Ser He Thr 
450 455 460 

CTC TCG TGG TCT CAA CCT GAC CAG CCC AAT GGA GTC ATC CTG GAT TAT 1440 
Leu Ser Trp Ser Gin Pro Asp Gin Pro Asn Gly Val He Leu Asp Tyr 
465 470 475 

GAG CTG CAA TAC TAT GAG AAG AAC CTG AGT GAG TTA AAT TCA ACA GCA 1488 
Glu Leu Gin Tyr Tyr Glu Lys Asn Leu Ser Glu Leu Asn Ser Thr Ala 
480 485 490 

GTG AAG AGC CCC ACC AAC ACT GTG ACA GTG CAA AAC CTC AAA GCT GGC 1536 
Val Lys Ser Pro Thr Asn Thr Val Thr Val Gin Asn Leu Lys Ala Gly 
495 500 505 

ACC ATC TAT GTC TTC CAA GTG CGA GCA CGT ACC GTG GCT GGG TAT GGC 1584 
Thr He Tyr Val Phe Gin Val Arg Ala Arg Thr Val Ala Gly Tyr Gly 
510 515 520 525 

CGG TAT AGT GGC AAG ATG TAC TTC CAG ACC ATG ACT GAA GCC GAG TAC 1632 
Arg Tyr Ser Gly Lys Met Tyr Phe Gin Thr Met Thr Glu Ala Glu Tyr 
530 535 540 

CAG ACC AGT GTC CAG GAG AAG CTG CCA CTC ATC ATT GGC TCC TCT GCA 1680
Gln Thr Ser Val Gln Glu Lys Leu Pro Leu Ile Ile Gly Ser Ser Ala
545 550 555

GCA GGA CTG GTG TTT CTC ATT GCT GTT GTC GTC ATC ATT ATT GTC TGC 1728 
Ala Gly Leu Val Phe Leu He Ala Val Val Val He He He Val Cys 
560 565 570 

AAC AGA AGA CGG GGC TTT GAA CGT GCT GAC TCT GAG TAC ACT GAC AAG 1776 
Asn Arg Arg Arg Gly Phe Glu Arg Ala Asp Ser Glu Tyr Thr Asp Lys 
575 580 585 

CTG CAG CAC TAT ACC AGT GGC CAC AGT ACG TAC CGT GGT CCC CCG CCA 1824 
Leu Gin His Tyr Thr Ser Gly His Ser Thr Tyr Arg Gly Pro Pro Pro 
590 595 600 605 

GGC CTG GGG GTC CGC TCT CTC TTC GTG ACT CCA GGG ATG AAG ATT TAT 1872 
Gly Leu Gly Val Arg Ser Leu Phe Val Thr Pro Gly Met Lys He Tyr 
610 615 620 

ATC GAT CCA TTT ACC TAC GAA GAT CCC AAT GAG GCT GTC AGG GAA TTT 1920 
He Asp Pro Phe Thr Tyr Glu Asp Pro Asn Glu Ala Val Arg Glu Phe 
625 630 635 

GCA AAA GAA ATT GAT ATC TCC TGT GTG AAA ATC GAG CAG GTG ATT GGG 1968 
Ala Lys Glu He Asp He Ser Cys Val Lys He Glu Gin Val He Gly 
640 645 650 

GCA GGG GAG TTT GGT GAG GTG TGC AGT GGG CAT CTC AAG CTT CCT GGC 2016 
Ala Gly Glu Phe Gly Glu Val Cys Ser Gly His Leu Lys Leu Pro Gly 
655 660 665 

AAA AGA GAG ATC TTT GTG GCC ATC AAG ACC CTG AAG TCT GGT TAC ACA 2064 
Lys Arg Glu He Phe Val Ala He Lys Thr Leu Lys Ser Gly Tyr Thr 
670 675 680 685 

GAG AAG CAG AGA CGG GAC TTC CTG AGT GAA GCC AGC ATC ATG GGG CAG 2112 
Glu Lys Gin Arg Arg Asp Phe Leu Ser Glu Ala Ser He Met Gly Gin 
690 695 700 

TTT GAC CAC CCC AAT GTC ATC CAC CTG GAA GGG GTG GTG ACC AAG AGT 2160 
Phe Asp His Pro Asn Val He His Leu Glu Gly Val Val Thr Lys Ser 
705 710 715 

TCC CCA GTC ATG ATC ATT ACA GAG TTC ATG GAG AAT GGC TCG TTG GAC 2208 
Ser Pro Val Met He He Thr Glu Phe Met Glu Asn Gly Ser Leu Asp 
720 725 730 

TCC TTC TTG AGG CAA AAT GAT GGG CAG TTC ACA GTG ATC CAG CTG GTG 2256 
Ser Phe Leu Arg Gin Asn Asp Gly Gin Phe Thr Val He Gin Leu Val 
735 740 745 

GGC ATG TTG CGT GGC ATT GCA GCA GGC ATG AAG TAC CTG GCT GAT ATG 2304 
Gly Met Leu Arg Gly He Ala Ala Gly Met Lys Tyr Leu Ala Asp Met 
750 755 760 765 

AAC TAC GTG CAC CGG GAC CTG GCT GCC CGC AAC ATC CTG GTC AAC AGC 2352 
Asn Tyr Val His Arg Asp Leu Ala Ala Arg Asn He Leu Val Asn Ser 
770 775 780 

AAC CTG GTC TGC AAG GTG TCC GAC TTC GGC CTC TCC CGT TTC CTG GAG 2400 
Asn Leu Val Cys Lys Val Ser Asp Phe Gly Leu Ser Arg Phe Leu Glu 
785 790 795 

GAT GAC ACC TCT GAT CCC ACT TAC ACC AGC GCA CTG GGT GGA AAG ATC 2448 
Asp Asp Thr Ser Asp Pro Thr Tyr Thr Ser Ala Leu Gly Gly Lys He 
800 805 810 

CCA ATA CGG TGG ACA GCG CCT GAG GCA ATT CAG TAC CGA AAA TTC ACA 2496
Pro He Arg Trp Thr Ala Pro Glu Ala He Gin Tyr Arg Lys Phe Thr 
815 820 825 

TCA GCC AGC GAT GTG TGG AGC TAT GGA ATA GTC ATG TGG GAG GTG ATG 2544
Ser Ala Ser Asp Val Trp Ser Tyr Gly He Val Met Trp Glu Val Met 
830 835 840 845 

TCG TAC GGC GAG CGG CCT TAC TGG GAC ATG ACC AAT CAA GAT GTG ATA 2592 
Ser Tyr Gly Glu Arg Pro Tyr Trp Asp Met Thr Asn Gin Asp Val He 
850 855 860 

AAT GCT ATT GAG CAG GAC TAT CGG CTA CCA CCC CCT ATG GAT TGT CCA 2640 
Asn Ala He Glu Gin Asp Tyr Arg Leu Pro Pro Pro Met Asp Cys Pro 
865 870 875 

AAT GCC CTG CAC CAG CTA ATG CTT GAC TGC TGG CAG AAG GAT CGA AAC 2688 
Asn Ala Leu His Gin Leu Met Leu Asp Cys Trp Gin Lys Asp Arg Asn 
880 885 890 

CAC AGA CCC AAA TTT GGA CAG ATT GTC AAC ACT TTA GAC AAA ATG ATC 2736 
His Arg Pro Lys Phe Gly Gin He Val Asn Thr Leu Asp Lys Met He 
895 900 905 

CGA AAT CCT AAT AGT CTG AAA GCC ATG GCA CCT CTC TCC TCT GGG GTT 2784 
Arg Asn Pro Asn Ser Leu Lys Ala Met Ala Pro Leu Ser Ser Gly Val 
910 915 920 925 

AAC CTC CCT CTA CTT GAC CGC ACA ATC CCA GAT TAT ACC AGC TTC AAC 2832 
Asn Leu Pro Leu Leu Asp Arg Thr Ile Pro Asp Tyr Thr Ser Phe Asn
930 935 940

ACT GTG GAT GAA TGG CTG GAT GCC ATC AAG ATG AGC CAG TAC AAG GAG 2880 
Thr Val Asp Glu Trp Leu Asp Ala He Lys Met Ser Gin Tyr Lys Glu 
945 950 955 

AGC TTT GCC AGT GCT GGC TTC ACC ACC TTT GAT ATA GTA TCT CAG ATG 2928 
Ser Phe Ala Ser Ala Gly Phe Thr Thr Phe Asp He Val Ser Gin Met 
960 965 970 

ACT GTA GAG GAC ATT CTA CGA GTT GGG GTC ACT TTA GCA GGA CAC CAG 2976 
Thr Val Glu Asp He Leu Arg Val Gly Val Thr Leu Ala Gly His Gin 
975 980 985 

AAG AAA ATT CTG AAC AGT ATC CAG GTG ATG AGA GCA CAG ATG AAC CAA 3024 
Lys Lys He Leu Asn Ser He Gin Val Met Arg Ala Gin Met Asn Gin 
990 995 1000 1005 

ATT CAG TCT GTG GAG GTT TGATAGCAAC ACGTCCTCGT GCTCCACTTC 3072 
He Gin Ser Val Glu Val 
1010 



CTTGAGGCCC TGCTCCCCTC TGCCCCTGTG TGTCTGAGCT CCAGTTCTTG AGTGTTCTGC 3132
GTGGATCAGA GACAGGCAGC TGCTCTGAGG ATCATGGCAA CAGGAAGAAA TGCCCTATCA 3192
TTGACAACGA GAAGTCATCA AGAGGTGAAA CAATGGAAAA CAATGGAAAA AGGGAACAAG 3252
TAAAGACAGC TATTTTGAAA ACCGAAAACA AACAGTGAAT TATTTTTAAA TAATAATAAA 3312
GCAATTGCAG TCTTGAAAAG GGCTCCAAGA CCAATGGGAG TCTCCAAAGG AAGAGAATAG 3372
AGCAGCTTCA TCTATTTCCT CTTACACAAG GGTTGCTGCA GCTGGGCCCA GACACTTCTG 3432
GAGTAACGAG ACTTTTCAAG AAGATGAATG CAAAGAATGG TCACAAGAAG CACTTCTCTT 3492
TCTCACATGG GATGGCAGCT CTGGGAATGC CCGGCAGTCC TTCCTGAAAG CCCTGTTGGC 3552 
AAATCGAAGA GGAGAGCCGA AGCTCTTTGG TGCTGTGGAA CCAAGTGCAT CTCAGAAATT 3612
GTTGGACTTC TACAAAAGCT GAAGACATTC TTTTTTTTTA AACAAGTAAA CTGATACTAG 3672
AAGAGGCTGT TTCCGTCAAA TGAGAAGGAA TCTGTAACAC TGGCCCGGGG GGGGTGGGGA 3732
ATGGGGGAAA TCAGTCCTTT TTACATCTCT TTATTTTCTC TTGTCATGGA ACAGTTTTGT 3792
GAGTGACAGT TTCCTAAGGG TCCGTCCATC CACCCTCCAA TGGCATCATT GTTTCATACA 3852
TATCATATGC ACAAGACTTA TAGTGATGTC CTCACTCGAT GCCAATGATC TTTCCCCAGA 3912
AGACTTCCCA AGTACAGTAT GTAGTAGATT TTGATTACAA ATGCTGACGT GTACCTTTAT 3972
TTTTCGGTTG TCGTTGTTGG GAGATTCGTC CTTTTACCTT GCTTTGTTAA CACCAATTTG 4032
TGAGTTTGGG GTTGGAATTT TTTTGGTCGA TTGGGGTTGT TTTTTTTTTT TTTTTTTTTT 4092
AACCG 4097

(2) INFORMATION FOR SEQ ID NO: 12:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1011 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12:

Met Pro Gly Pro Glu Arg Thr Met Gly Pro Leu Trp Phe Cys Cys Leu
1 5 10 15

Pro Leu Ala Leu Leu Pro Leu Leu Ala Ala Val Glu Glu Thr Leu Met
20 25 30

Asp Ser Thr Thr Ala Thr Ala Glu Leu Gly Trp Met Val His Pro Pro
35 40 45

Ser Gly Trp Glu Glu Val Ser Gly Tyr Asp Glu Asn Met Asn Thr Ile
50 55 60

Arg Thr Tyr Gln Val Cys Asn Val Phe Glu Ser Ser Gln Asn Asn Trp
65 70 75 80

Leu Arg Thr Lys Tyr Ile Arg Arg Arg Gly Ala His Arg Ile His Val
85 90 95

Glu Met Lys Phe Ser Val Arg Asp Cys Ser Ser Ile Pro Asn Val Pro
100 105 110

Gly Ser Cys Lys Glu Thr Phe Asn Leu Tyr Tyr Tyr Glu Ser Asp Phe
115 120 125

Asp Ser Ala Thr Lys Thr Phe Pro Asn Trp Met Glu Asn Pro Trp Met
130 135 140

Lys Val Asp Thr Ile Ala Ala Asp Glu Ser Phe Ser Gln Val Asp Leu
145 150 155 160

Gly Gly Arg Val Met Lys Ile Asn Thr Glu Val Arg Ser Phe Gly Pro
165 170 175

Val Ser Lys Asn Gly Phe Tyr Leu Ala Phe Gln Asp Tyr Gly Gly Cys
180 185 190

Met Ser Leu Ile Ala Val Arg Val Phe Tyr Arg Lys Cys Pro Arg Val
195 200 205

Ile Gln Asn Gly Ala Val Phe Gln Glu Thr Leu Ser Gly Ala Glu Ser
210 215 220

Thr Ser Leu Val Ala Ala Arg Gly Thr Cys Ile Ser Asn Ala Glu Glu
225 230 235 240

Val Asp Val Pro Ile Lys Leu Tyr Cys Asn Gly Asp Gly Glu Trp Leu
245 250 255

Val Pro Ile Gly Arg Cys Met Cys Arg Pro Gly Tyr Glu Ser Val Glu
260 265 270

Asn Gly Thr Val Cys Arg Gly Cys Pro Ser Gly Thr Phe Lys Ala Ser
275 280 285

Gln Gly Asp Glu Gly Cys Val His Cys Pro Ile Asn Ser Arg Thr Thr
290 295 300

Ser Glu Gly Ala Thr Asn Cys Val Cys Arg Asn Gly Tyr Tyr Arg Ala
305 310 315 320

Asp Ala Asp Pro Val Asp Met Pro Cys Thr Thr Ile Pro Ser Ala Pro
325 330 335

Gln Ala Val Ile Ser Ser Val Asn Glu Thr Ser Leu Met Leu Glu Trp
340 345 350

Thr Pro Pro Arg Asp Ser Gly Gly Arg Glu Asp Leu Val Tyr Asn Ile
355 360 365

Ile Cys Lys Ser Cys Gly Ser Gly Arg Gly Ala Cys Thr Arg Cys Gly
370 375 380

Asp Asn Val Gln Phe Ala Pro Arg Gln Leu Gly Leu Thr Glu Pro Arg
385 390 395 400

Ile Tyr Ile Ser Asp Leu Leu Ala His Thr Gln Tyr Thr Phe Glu Ile
405 410 415

Gln Ala Val Asn Gly Val Thr Asp Gln Ser Pro Phe Ser Pro Gln Phe
420 425 430

Ala Ser Val Asn Ile Thr Thr Asn Gln Ala Ala Pro Ser Ala Val Ser
435 440 445

Ile Met His Gln Val Ser Arg Thr Val Asp Ser Ile Thr Leu Ser Trp
450 455 460

Ser Gln Pro Asp Gln Pro Asn Gly Val Ile Leu Asp Tyr Glu Leu Gln
465 470 475 480

Tyr Tyr Glu Lys Asn Leu Ser Glu Leu Asn Ser Thr Ala Val Lys Ser
485 490 495

Pro Thr Asn Thr Val Thr Val Gln Asn Leu Lys Ala Gly Thr Ile Tyr
500 505 510

Val Phe Gln Val Arg Ala Arg Thr Val Ala Gly Tyr Gly Arg Tyr Ser
515 520 525

Gly Lys Met Tyr Phe Gln Thr Met Thr Glu Ala Glu Tyr Gln Thr Ser
530 535 540

Val Gln Glu Lys Leu Pro Leu Ile Ile Gly Ser Ser Ala Ala Gly Leu
545 550 555 560

Val Phe Leu Ile Ala Val Val Val Ile Ile Ile Val Cys Asn Arg Arg
565 570 575

Arg Gly Phe Glu Arg Ala Asp Ser Glu Tyr Thr Asp Lys Leu Gln His
580 585 590

Tyr Thr Ser Gly His Ser Thr Tyr Arg Gly Pro Pro Pro Gly Leu Gly
595 600 605

Val Arg Ser Leu Phe Val Thr Pro Gly Met Lys Ile Tyr Ile Asp Pro
610 615 620

Phe Thr Tyr Glu Asp Pro Asn Glu Ala Val Arg Glu Phe Ala Lys Glu
625 630 635 640

Ile Asp Ile Ser Cys Val Lys Ile Glu Gln Val Ile Gly Ala Gly Glu
645 650 655

Phe Gly Glu Val Cys Ser Gly His Leu Lys Leu Pro Gly Lys Arg Glu
660 665 670

Ile Phe Val Ala Ile Lys Thr Leu Lys Ser Gly Tyr Thr Glu Lys Gln
675 680 685

Arg Arg Asp Phe Leu Ser Glu Ala Ser Ile Met Gly Gln Phe Asp His
690 695 700

Pro Asn Val Ile His Leu Glu Gly Val Val Thr Lys Ser Ser Pro Val
705 710 715 720

Met Ile Ile Thr Glu Phe Met Glu Asn Gly Ser Leu Asp Ser Phe Leu
725 730 735

Arg Gln Asn Asp Gly Gln Phe Thr Val Ile Gln Leu Val Gly Met Leu
740 745 750

Arg Gly Ile Ala Ala Gly Met Lys Tyr Leu Ala Asp Met Asn Tyr Val
755 760 765

His Arg Asp Leu Ala Ala Arg Asn Ile Leu Val Asn Ser Asn Leu Val
770 775 780

Cys Lys Val Ser Asp Phe Gly Leu Ser Arg Phe Leu Glu Asp Asp Thr
785 790 795 800

Ser Asp Pro Thr Tyr Thr Ser Ala Leu Gly Gly Lys Ile Pro Ile Arg
805 810 815

Trp Thr Ala Pro Glu Ala Ile Gln Tyr Arg Lys Phe Thr Ser Ala Ser
820 825 830

Asp Val Trp Ser Tyr Gly Ile Val Met Trp Glu Val Met Ser Tyr Gly
835 840 845

Glu Arg Pro Tyr Trp Asp Met Thr Asn Gln Asp Val Ile Asn Ala Ile
850 855 860

Glu Gln Asp Tyr Arg Leu Pro Pro Pro Met Asp Cys Pro Asn Ala Leu
865 870 875 880

His Gln Leu Met Leu Asp Cys Trp Gln Lys Asp Arg Asn His Arg Pro
885 890 895

Lys Phe Gly Gln Ile Val Asn Thr Leu Asp Lys Met Ile Arg Asn Pro
900 905 910

Asn Ser Leu Lys Ala Met Ala Pro Leu Ser Ser Gly Val Asn Leu Pro
915 920 925

Leu Leu Asp Arg Thr Ile Pro Asp Tyr Thr Ser Phe Asn Thr Val Asp
930 935 940

Glu Trp Leu Asp Ala Ile Lys Met Ser Gln Tyr Lys Glu Ser Phe Ala
945 950 955 960

Ser Ala Gly Phe Thr Thr Phe Asp Ile Val Ser Gln Met Thr Val Glu
965 970 975

Asp Ile Leu Arg Val Gly Val Thr Leu Ala Gly His Gln Lys Lys Ile
980 985 990

Leu Asn Ser Ile Gln Val Met Arg Ala Gln Met Asn Gln Ile Gln Ser
995 1000 1005

Val Glu Val
1010

(2) INFORMATION FOR SEQ ID NO: 13: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 3591 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: both

(D) TOPOLOGY: linear 



(ix) FEATURE: 

(A) NAME/KEY: CDS 

(B) LOCATION: 2.. 2965 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 13:

C GGG GTC TCC TCG AGG GCG CGG CGG CCG CCG GGC AGC AGC AGG AGC 46 
Gly Val Ser Ser Arg Ala Arg Arg Pro Pro Gly Ser Ser Arg Ser 
1 5 10 15 

AGC AGG AGG GGG GTG ACC TCG GAG CTG GCA TGG ACA ACC CAT CCG GAG 94 
Ser Arg Arg Gly Val Thr Ser Glu Leu Ala Trp Thr Thr His Pro Glu 
20 25 30 

ACG GGG TGG GAA GAG GTC AGT GGT TAC GAC GAG GCT ATG AAC CCC ATC 142
Thr Gly Trp Glu Glu Val Ser Gly Tyr Asp Glu Ala Met Asn Pro He 
35 40 45 

CGC ACA TAC CAG GTG TGC AAC GTG CGG GAG GCC AAC CAG AAC AAC TGG 190
Arg Thr Tyr Gin Val Cys Asn Val Arg Glu Ala Asn Gin Asn Asn Trp 
50 55 60 

CTT CGC ACC AAG TTC ATT CAG CGC CAG GAC GTC CAG CGT GTC TAC GTG 238 
Leu Arg Thr Lys Phe He Gin Arg Gin Asp Val Gin Arg Val Tyr Val 
65 70 75 

GAG CTG AAA TTC ACT GTG CGG GAC TGC AAC AGC ATC CCC AAC ATC CCT 286 
Glu Leu Lys Phe Thr Val Arg Asp Cys Asn Ser He Pro Asn He Pro 
80 85 90 95 



GGT TCC TGC AAA GAG ACC TTC AAC CTC TTC TAT TAT GAG TCA GAT ACG 334 
Gly Ser Cys Lys Glu Thr Phe Asn Leu Phe Tyr Tyr Glu Ser Asp Thr 
100 105 110 

GAT TCT GCC TCT GCC AAT AGC CCT TTC TGG ATG GAG AAC CCC TAT ATC 382
Asp Ser Ala Ser Ala Asn Ser Pro Phe Trp Met Glu Asn Pro Tyr He 
115 120 125 

AAA GTG GAT ACA ATT GCT CCG GAT GAG AGC TTC TCC AAA CTG GAG TCC 430 
Lys Val Asp Thr He Ala Pro Asp Glu Ser Phe Ser Lys Leu Glu Ser 
130 135 140 

GGC CGT GTG AAC ACC AAG GTG CGC AGC TTT GGG CCG CTC TCC AAG AAT 478 
Gly Arg Val Asn Thr Lys Val Arg Ser Phe Gly Pro Leu Ser Lys Asn 
145 150 155

GGC TTT TAT CTG GCT TTC CAG GAC CTG GGG GCC TGC ATG TCC CTT ATC 526 
Gly Phe Tyr Leu Ala Phe Gin Asp Leu Gly Ala Cys Met Ser Leu He 
160 165 170 175 

TCC GTC CGG GCT TTC TAC AAG AAA TGT TCC AAC ACC ATC GCT GGC TTT 574 
Ser Val Arg Ala Phe Tyr Lys Lys Cys Ser Asn Thr Ile Ala Gly Phe
180 185 190 

GCT ATC TTC CCG GAG ACC CTA ACG GGG GCT GAG CCC ACG TCG CTG GTC 622 
Ala Ile Phe Pro Glu Thr Leu Thr Gly Ala Glu Pro Thr Ser Leu Val
195 200 205 

ATT GCG CCG GGC ACC TGC ATC CCC AAC GCA GTG GAA GTG TCT GTG CCC 670 
Ile Ala Pro Gly Thr Cys Ile Pro Asn Ala Val Glu Val Ser Val Pro
210 215 220 

CTG AAG CTG TAC TGC AAC GGT GAT GGC GAG TGG ATG GTG CCT GTG GGA 718 
Leu Lys Leu Tyr Cys Asn Gly Asp Gly Glu Trp Met Val Pro Val Gly 
225 230 235 

GCG TGC ACG TGT GCT GCT GGG TAC GAG CCA GCC ATG AAG GAT ACC CAG 766 
Ala Cys Thr Cys Ala Ala Gly Tyr Glu Pro Ala Met Lys Asp Thr Gln
240 245 250 255 

TGC CAA GCA TGC GGC CCG GGG ACG TTC AAA TCC AAG CAG GGC GAG GGC 814 
Cys Gln Ala Cys Gly Pro Gly Thr Phe Lys Ser Lys Gln Gly Glu Gly
260 265 270 

CCC TGC TCC CCC TGC CCT CCC AAC AGC CGC ACC ACC GCG GGG GCA GCC 862 
Pro Cys Ser Pro Cys Pro Pro Asn Ser Arg Thr Thr Ala Gly Ala Ala 
275 280 285 

ACA GTC TGC ATA TGT CGC AGC GGC TTC TTC CGA GCA GAC GCG GAC CCC 910 
Thr Val Cys Ile Cys Arg Ser Gly Phe Phe Arg Ala Asp Ala Asp Pro
290 295 300 

GCA GAC AGC GCC TGC ACC AGT GTG CCC TCA GCC CCA CGC AGC GTC ATC 958 
Ala Asp Ser Ala Cys Thr Ser Val Pro Ser Ala Pro Arg Ser Val Ile
305 310 315 

TCC AAC GTG AAT GAG ACG TCG TTG GTG CTG GAG TGG AGC GAG CCG CAG 1006 
Ser Asn Val Asn Glu Thr Ser Leu Val Leu Glu Trp Ser Glu Pro Gln
320 325 330 335 

GAC GCG GGC GGG CGG GAT GAC CTG CTC TAC AAC GTC ATC TGC AAG AAG 1054 
Asp Ala Gly Gly Arg Asp Asp Leu Leu Tyr Asn Val Ile Cys Lys Lys
340 345 350

TGC AGC GTG GAG CGG CGG CTG TGC AGC CGC TGC GAC GAC AAC GTG GAG 1102 
Cys Ser Val Glu Arg Arg Leu Cys Ser Arg Cys Asp Asp Asn Val Glu 
355 360 365 

TTC GTG CCG CGC CAG CTG GGC CTC ACT GGC CTC ACT GAG CGA CGC ATC 1150 
Phe Val Pro Arg Gln Leu Gly Leu Thr Gly Leu Thr Glu Arg Arg Ile
370 375 380 
TAC ATC AGC AAG GTG ATG GCC CAC CCC CAG TAC ACC TTC GAG ATC CAG 1198
Tyr Ile Ser Lys Val Met Ala His Pro Gln Tyr Thr Phe Glu Ile Gln
385 390 395 

GCG GTG AAT GGC ATC TCC AGC AAG AGC CCC TAC CCT CCC CAT TTT GCC 1246 
Ala Val Asn Gly Ile Ser Ser Lys Ser Pro Tyr Pro Pro His Phe Ala
400 405 410 415 

TCC GTC AAC ATC ACG ACC AAC CAG GCA GCC CCA TCT GCC GTG CCC ACC 1294 
Ser Val Asn Ile Thr Thr Asn Gln Ala Ala Pro Ser Ala Val Pro Thr
420 425 430 

ATG CAT CTG CAC AGC AGC ACC GGG AAC AGC ATG ACA CTG TCA TGG ACT 1342 
Met His Leu His Ser Ser Thr Gly Asn Ser Met Thr Leu Ser Trp Thr 
435 440 445 

CCC CCG GAA AGG CCC AAC GGC ATC ATT CTC GAC TAT GAA ATC AAG TAC 1390 
Pro Pro Glu Arg Pro Asn Gly Ile Ile Leu Asp Tyr Glu Ile Lys Tyr
450 455 460 

TCC GAG AAG CAA GGC CAG GGT GAC GGC ATT GCC AAC ACT GTC ACC AGC 1438 
Ser Glu Lys Gln Gly Gln Gly Asp Gly Ile Ala Asn Thr Val Thr Ser
465 470 475 

CAG AAG AAC TCG GTG CGG CTG GAC GGA CTG AAG GCC AAT GCT CGG TAC 1486 
Gln Lys Asn Ser Val Arg Leu Asp Gly Leu Lys Ala Asn Ala Arg Tyr
480 485 490 495 

ATG GTG CAG GTC CGG GCG CGC ACA GTG GCT GGA TAC GGC CGC TAC AGC 1534 
Met Val Gln Val Arg Ala Arg Thr Val Ala Gly Tyr Gly Arg Tyr Ser
500 505 510 

CTC CCC ACC GAG TTC CAG ACG ACT GCG GAG GAT GGC TCC ACC AGC AAG 1582 
Leu Pro Thr Glu Phe Gln Thr Thr Ala Glu Asp Gly Ser Thr Ser Lys
515 520 525 

ACT TTC CAG GAG CTT CCT CTC ATC GTG GGT TCA GCC ACC GCG GGA CTG 1630 
Thr Phe Gln Glu Leu Pro Leu Ile Val Gly Ser Ala Thr Ala Gly Leu
530 535 540 

CTG TTT GTC ATC GTG GTG GTC ATC ATC GCT ATT GTC TGC TTC AGG AAA 1678 
Leu Phe Val Ile Val Val Val Ile Ile Ala Ile Val Cys Phe Arg Lys
545 550 555 

GGG ATG GTT ACT GAA CAA CTC CTC TCG TCT CCT TTG GGC AGG AAG CAG 1726 
Gly Met Val Thr Glu Gln Leu Leu Ser Ser Pro Leu Gly Arg Lys Gln
560 565 570 575 

CGC AAC AGC ACA GAT CCC GAG TAC ACA GAG AAG CTG CAG CAA TAT GTC 1774 
Arg Asn Ser Thr Asp Pro Glu Tyr Thr Glu Lys Leu Gln Gln Tyr Val
580 585 590 

ACT CCT GGG ATG AAG GTC TAC ATT GAC CCC TTC ACC TAT GAA GAC CCA 1822 
Thr Pro Gly Met Lys Val Tyr Ile Asp Pro Phe Thr Tyr Glu Asp Pro
595 600 605 

AAT GAA GCT GTC CGG GAA TTC GCC AAA GAG ATT GAT ATC TCC TGT GTC 1870 
Asn Glu Ala Val Arg Glu Phe Ala Lys Glu Ile Asp Ile Ser Cys Val
610 615 620 

AAA ATT GAG GAG GTC ATT GGA GCA GGA GAG TTT GGT GAG GTG TGC CGT 1918 
Lys Ile Glu Glu Val Ile Gly Ala Gly Glu Phe Gly Glu Val Cys Arg
625 630 635 

GGG CGC CTG AAG CTG CCT GGC CGC CGT GAG ATC TTT GTG GCC ATC AAG 1966 
Gly Arg Leu Lys Leu Pro Gly Arg Arg Glu Ile Phe Val Ala Ile Lys
640 645 650 655 
ACA CTG AAG GTG GGC TAC ACA GAG AGG CAG CGG CGG GAC TTC CTG AGT 2014
Thr Leu Lys Val Gly Tyr Thr Glu Arg Gln Arg Arg Asp Phe Leu Ser
660 665 670

GAG GCC AGC ATC ATG GGC CAG TTC GAC CAC CCC AAC ATC ATC CAC CTG 2062 
Glu Ala Ser Ile Met Gly Gln Phe Asp His Pro Asn Ile Ile His Leu
675 680 685 

GAG GGC GTG GTG ACC AAG AGC CGC CCT GTC ATG ATC ATC ACA GAG TTC 2110 
Glu Gly Val Val Thr Lys Ser Arg Pro Val Met Ile Ile Thr Glu Phe
690 695 700 

ATG GAG AAC TGC GCT CTC GAC TCC TTC CTC CGG CTG AAT GAT GGG CAG 2158
Met Glu Asn Cys Ala Leu Asp Ser Phe Leu Arg Leu Asn Asp Gly Gln
705 710 715 

TTC ACG GTC ATC CAG CTG GTG GGG ATG CTG CGA GGC ATC GCT GCT GGC 2206 
Phe Thr Val Ile Gln Leu Val Gly Met Leu Arg Gly Ile Ala Ala Gly
720 725 730 735 

ATG AAG TAC CTC TCA GAG ATG AAC TAC GTG CAC CGA GAC CTG GCT GCC 2254 
Met Lys Tyr Leu Ser Glu Met Asn Tyr Val His Arg Asp Leu Ala Ala 
740 745 750 

CGC AAC ATC CTG GTC AAC AGC AAC TTG GTC TGC AAA GTG TCT GAC TTC 2302 
Arg Asn Ile Leu Val Asn Ser Asn Leu Val Cys Lys Val Ser Asp Phe
755 760 765 

GGG CTC TCC CGC TTT TTG GAG GAT GAT CCA GCC GAC CCC ACC TAC ACC 2350 
Gly Leu Ser Arg Phe Leu Glu Asp Asp Pro Ala Asp Pro Thr Tyr Thr 
770 775 780 

AGC TCC CTG GGA GGC AAG ATC CCC ATC AGG TGG ACA GCT CCT GAG GCC 2398 
Ser Ser Leu Gly Gly Lys Ile Pro Ile Arg Trp Thr Ala Pro Glu Ala
785 790 795 

ATC GCC TAC CGC AAA TTC ACG TCG GCC AGC GAC GTG TGG AGC TAC GGC 2446 
Ile Ala Tyr Arg Lys Phe Thr Ser Ala Ser Asp Val Trp Ser Tyr Gly
800 805 810 815 

ATC GTC ATG TGG GAA GTG ATG TCC TAC GGG GAG CGA CCC TAC TGG GAC 2494 
Ile Val Met Trp Glu Val Met Ser Tyr Gly Glu Arg Pro Tyr Trp Asp
820 825 830 

ATG TCC AAC CAG GAT GTG ATC AAC GCG GTG GAG CAG GAT TAC CGC CTG 2542 
Met Ser Asn Gln Asp Val Ile Asn Ala Val Glu Gln Asp Tyr Arg Leu
835 840 845 

CCA CCC CCC ATG GAC TGC CCC ACA GCA CTG CAC CAG CTG ATG CTG GAC 2590 
Pro Pro Pro Met Asp Cys Pro Thr Ala Leu His Gln Leu Met Leu Asp
850 855 860 

TGC TGG GTG CGG GAC CGC AAC CTG CGG CCC AAG TTT GCA CAG ATT GTC 2638 
Cys Trp Val Arg Asp Arg Asn Leu Arg Pro Lys Phe Ala Gln Ile Val
865 870 875 

AAC ACG CTG GAC AAG CTG ATC CGC AAT GCT GCC AGC CTG AAG GTC ATC 2686 
Asn Thr Leu Asp Lys Leu Ile Arg Asn Ala Ala Ser Leu Lys Val Ile
880 885 890 895 

GCC AGC GTC CAG TCC GGT GTC TCC CAG CCG CTC CTG GAC CGC ACC GTG 2734 
Ala Ser Val Gln Ser Gly Val Ser Gln Pro Leu Leu Asp Arg Thr Val
900 905 910 

CCC GAT TAC ACC ACC TTC ACC ACC GTG GGA GAC TGG CTG GAT GCC ATC 2782 
Pro Asp Tyr Thr Thr Phe Thr Thr Val Gly Asp Trp Leu Asp Ala Ile
915 920 925
AAA ATG GGA CGG TAC AAG GAG AAC TTC GTC AAC GCC GGC TTC GCC TCC 2830
Lys Met Gly Arg Tyr Lys Glu Asn Phe Val Asn Ala Gly Phe Ala Ser 
930 935 940 

TTT GAC CTG GTG GCA CAG ATG ACA GCA GAG GAC CTG CTA AGG ATA GGA 2878 
Phe Asp Leu Val Ala Gln Met Thr Ala Glu Asp Leu Leu Arg Ile Gly
945 950 955 

GTG ACG CTA GCA GGG CAC CAG AAG AAG ATC CTG AGC AGC ATT CAG GAC 2926 
Val Thr Leu Ala Gly His Gln Lys Lys Ile Leu Ser Ser Ile Gln Asp
960 965 970 975 

ATG AGG CTG CAG ATG AAC CAG ACG CTG CCG GTT CAG GTT TGACCGCAGG 2975
Met Arg Leu Gln Met Asn Gln Thr Leu Pro Val Gln Val
980 985

GACTCTGCAT TGGAACGGAC TGAGGGAACC TGCCAACCAG GTTCTGTTTG CGGTGCAGCC 3035

CGGCTTCCCG ATTTCCCCTT CCCGTGGCGC TCCTCTGCCT CGGACGCTCG CCGGGGACAG 3095

GCTGGGCCGG GCCACCCTTC CCTGGATCAG AGGCACTCGT GCCGGGAGGG AGCCCGGCTT 3155

TTCGTCCCGT GTCCCGCAGC GGCGAGGCAG TGAACGCAGT CTTCATATTG AAGATGGATT 3215

ATGGGACGGA GATGGCGCAT CCGCTTCCCG CCCTGTCTCA GTGCTCATCA GTTTGAAGAG 3275

ATGTTCTGCT TCTTGGATTT CTTTACACCC CGGTTTTCCC CCCTCGAGTC CTCACTTCCC 3335

CCTATCCCTG AGGCCACAGA CTGTTGACCC GTCCGCTGAG TCCGTCAGAC GCTCCGAAGC 3395

CTTCCCCGAG CCCGGTCCCC GCGTGGAGAC GGCGCCAGGG ACGGGGCTAC GGCCCCAGAC 3455

AATCACTCCA CCCCTCCGCA CGAGGGTCCT CACTGGGACG TGTCTGAAGG GGAAAGGCTC 3515

TGCTCCCTTT TTGGCTTTGC ACGCCAGAAC CCGAACCCCG TGAGATTTAC TATGCAGGGA 3575

GTTAGGCAAA AAAAAG 3591



(2) INFORMATION FOR SEQ ID NO: 14: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 988 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 14: 

Gly Val Ser Ser Arg Ala Arg Arg Pro Pro Gly Ser Ser Arg Ser Ser 
1 5 10 15

Arg Arg Gly Val Thr Ser Glu Leu Ala Trp Thr Thr His Pro Glu Thr
20 25 30

Gly Trp Glu Glu Val Ser Gly Tyr Asp Glu Ala Met Asn Pro Ile Arg
35 40 45

Thr Tyr Gln Val Cys Asn Val Arg Glu Ala Asn Gln Asn Asn Trp Leu
50 55 60

Arg Thr Lys Phe Ile Gln Arg Gln Asp Val Gln Arg Val Tyr Val Glu
65 70 75 80

Leu Lys Phe Thr Val Arg Asp Cys Asn Ser Ile Pro Asn Ile Pro Gly
85 90 95
Ser Cys Lys Glu Thr Phe Asn Leu Phe Tyr Tyr Glu Ser Asp Thr Asp
100 105 110

Ser Ala Ser Ala Asn Ser Pro Phe Trp Met Glu Asn Pro Tyr Ile Lys
115 120 125

Val Asp Thr Ile Ala Pro Asp Glu Ser Phe Ser Lys Leu Glu Ser Gly
130 135 140

Arg Val Asn Thr Lys Val Arg Ser Phe Gly Pro Leu Ser Lys Asn Gly
145 150 155 160

Phe Tyr Leu Ala Phe Gln Asp Leu Gly Ala Cys Met Ser Leu Ile Ser
165 170 175

Val Arg Ala Phe Tyr Lys Lys Cys Ser Asn Thr Ile Ala Gly Phe Ala
180 185 190

Ile Phe Pro Glu Thr Leu Thr Gly Ala Glu Pro Thr Ser Leu Val Ile
195 200 205

Ala Pro Gly Thr Cys Ile Pro Asn Ala Val Glu Val Ser Val Pro Leu
210 215 220

Lys Leu Tyr Cys Asn Gly Asp Gly Glu Trp Met Val Pro Val Gly Ala 
225 230 235 240 

Cys Thr Cys Ala Ala Gly Tyr Glu Pro Ala Met Lys Asp Thr Gln Cys
245 250 255

Gln Ala Cys Gly Pro Gly Thr Phe Lys Ser Lys Gln Gly Glu Gly Pro
260 265 270

Cys Ser Pro Cys Pro Pro Asn Ser Arg Thr Thr Ala Gly Ala Ala Thr
275 280 285

Val Cys Ile Cys Arg Ser Gly Phe Phe Arg Ala Asp Ala Asp Pro Ala
290 295 300

Asp Ser Ala Cys Thr Ser Val Pro Ser Ala Pro Arg Ser Val Ile Ser
305 310 315 320

Asn Val Asn Glu Thr Ser Leu Val Leu Glu Trp Ser Glu Pro Gln Asp
325 330 335

Ala Gly Gly Arg Asp Asp Leu Leu Tyr Asn Val Ile Cys Lys Lys Cys
340 345 350

Ser Val Glu Arg Arg Leu Cys Ser Arg Cys Asp Asp Asn Val Glu Phe 
355 360 365 

Val Pro Arg Gln Leu Gly Leu Thr Gly Leu Thr Glu Arg Arg Ile Tyr
370 375 380

Ile Ser Lys Val Met Ala His Pro Gln Tyr Thr Phe Glu Ile Gln Ala
385 390 395 400

Val Asn Gly Ile Ser Ser Lys Ser Pro Tyr Pro Pro His Phe Ala Ser
405 410 415

Val Asn Ile Thr Thr Asn Gln Ala Ala Pro Ser Ala Val Pro Thr Met
420 425 430

His Leu His Ser Ser Thr Gly Asn Ser Met Thr Leu Ser Trp Thr Pro 
435 440 445 
Pro Glu Arg Pro Asn Gly Ile Ile Leu Asp Tyr Glu Ile Lys Tyr Ser
450 455 460

Glu Lys Gln Gly Gln Gly Asp Gly Ile Ala Asn Thr Val Thr Ser Gln
465 470 475 480

Lys Asn Ser Val Arg Leu Asp Gly Leu Lys Ala Asn Ala Arg Tyr Met 
485 490 495 

Val Gln Val Arg Ala Arg Thr Val Ala Gly Tyr Gly Arg Tyr Ser Leu
500 505 510

Pro Thr Glu Phe Gln Thr Thr Ala Glu Asp Gly Ser Thr Ser Lys Thr
515 520 525

Phe Gln Glu Leu Pro Leu Ile Val Gly Ser Ala Thr Ala Gly Leu Leu
530 535 540

Phe Val Ile Val Val Val Ile Ile Ala Ile Val Cys Phe Arg Lys Gly
545 550 555 560

Met Val Thr Glu Gln Leu Leu Ser Ser Pro Leu Gly Arg Lys Gln Arg
565 570 575

Asn Ser Thr Asp Pro Glu Tyr Thr Glu Lys Leu Gln Gln Tyr Val Thr
580 585 590

Pro Gly Met Lys Val Tyr Ile Asp Pro Phe Thr Tyr Glu Asp Pro Asn
595 600 605

Glu Ala Val Arg Glu Phe Ala Lys Glu Ile Asp Ile Ser Cys Val Lys
610 615 620

Ile Glu Glu Val Ile Gly Ala Gly Glu Phe Gly Glu Val Cys Arg Gly
625 630 635 640

Arg Leu Lys Leu Pro Gly Arg Arg Glu Ile Phe Val Ala Ile Lys Thr
645 650 655

Leu Lys Val Gly Tyr Thr Glu Arg Gln Arg Arg Asp Phe Leu Ser Glu
660 665 670

Ala Ser Ile Met Gly Gln Phe Asp His Pro Asn Ile Ile His Leu Glu
675 680 685

Gly Val Val Thr Lys Ser Arg Pro Val Met Ile Ile Thr Glu Phe Met
690 695 700

Glu Asn Cys Ala Leu Asp Ser Phe Leu Arg Leu Asn Asp Gly Gln Phe
705 710 715 720

Thr Val Ile Gln Leu Val Gly Met Leu Arg Gly Ile Ala Ala Gly Met
725 730 735

Lys Tyr Leu Ser Glu Met Asn Tyr Val His Arg Asp Leu Ala Ala Arg
740 745 750

Asn Ile Leu Val Asn Ser Asn Leu Val Cys Lys Val Ser Asp Phe Gly
755 760 765

Leu Ser Arg Phe Leu Glu Asp Asp Pro Ala Asp Pro Thr Tyr Thr Ser
770 775 780

Ser Leu Gly Gly Lys Ile Pro Ile Arg Trp Thr Ala Pro Glu Ala Ile
785 790 795 800
Ala Tyr Arg Lys Phe Thr Ser Ala Ser Asp Val Trp Ser Tyr Gly Ile
805 810 815 

Val Met Trp Glu Val Met Ser Tyr Gly Glu Arg Pro Tyr Trp Asp Met 
820 825 830 

Ser Asn Gln Asp Val Ile Asn Ala Val Glu Gln Asp Tyr Arg Leu Pro
835 840 845

Pro Pro Met Asp Cys Pro Thr Ala Leu His Gln Leu Met Leu Asp Cys
850 855 860

Trp Val Arg Asp Arg Asn Leu Arg Pro Lys Phe Ala Gln Ile Val Asn
865 870 875 880

Thr Leu Asp Lys Leu Ile Arg Asn Ala Ala Ser Leu Lys Val Ile Ala
885 890 895

Ser Val Gln Ser Gly Val Ser Gln Pro Leu Leu Asp Arg Thr Val Pro
900 905 910

Asp Tyr Thr Thr Phe Thr Thr Val Gly Asp Trp Leu Asp Ala Ile Lys
915 920 925

Met Gly Arg Tyr Lys Glu Asn Phe Val Asn Ala Gly Phe Ala Ser Phe 
930 935 940 

Asp Leu Val Ala Gln Met Thr Ala Glu Asp Leu Leu Arg Ile Gly Val
945 950 955 960

Thr Leu Ala Gly His Gln Lys Lys Ile Leu Ser Ser Ile Gln Asp Met
965 970 975

Arg Leu Gln Met Asn Gln Thr Leu Pro Val Gln Val
980 985 

(2) INFORMATION FOR SEQ ID NO: 15:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 3254 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: both

(D) TOPOLOGY: linear 

(ix) FEATURE: 

(A) NAME/KEY: CDS 

(B) LOCATION: 32..2980

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 15: 

CGCTCTGCTC GCGCGCTGCT GCCCCGCCGA C ATG GAC CGC CGC CGC CTG CCG 52 

Met Asp Arg Arg Arg Leu Pro 
1 5 

CTG CTG CTG CTC TGC GCT GCC CTC GGC TCC GCC GGG CGT CTG AGC GCC 100 
Leu Leu Leu Leu Cys Ala Ala Leu Gly Ser Ala Gly Arg Leu Ser Ala 
10 15 20 

CGC CCC GGC AAC GAA GTT AAT CTG CTG GAT TCA AAA ACA ATT CAA GGG 148 
Arg Pro Gly Asn Glu Val Asn Leu Leu Asp Ser Lys Thr Ile Gln Gly
25 30 35 

GAG CTG GGC TGG ATC TCC TAC CCA TCA CAT GGG TGG GAA GAG ATT AGT 196 
Glu Leu Gly Trp Ile Ser Tyr Pro Ser His Gly Trp Glu Glu Ile Ser
40 45 50 55 
GGT GTT GAT GAG CAT TAT ACT CCA ATC AGA ACT TAC CAA GAG AGC AAT 244
Gly Val Asp Glu His Tyr Thr Pro Ile Arg Thr Tyr Gln Glu Ser Asn
60 65 70 



GTT ATG GAT CAC AGT CAA AAC AAT TGG CTG CGA ACA AAC TGG ATT CCA 292 
Val Met Asp His Ser Gln Asn Asn Trp Leu Arg Thr Asn Trp Ile Pro
75 80 85 

CGC AAT TCA GCG CAG AAG ATA TAT GTG GAG CTC AAG TTT ACC TTG AGG 340 
Arg Asn Ser Ala Gln Lys Ile Tyr Val Glu Leu Lys Phe Thr Leu Arg
90 95 100 

GAC TGC AAT AGT ATC CCT CTA GTT CTG GGC ACT TGC AAA GAG ACT TTC 388 
Asp Cys Asn Ser Ile Pro Leu Val Leu Gly Thr Cys Lys Glu Thr Phe
105 110 115 

AAT CTG TAT TAC ATG GAA TCC GAT GAT GAC CAT TTG GCA AAG TTC AGA 436 
Asn Leu Tyr Tyr Met Glu Ser Asp Asp Asp His Leu Ala Lys Phe Arg 
120 125 130 135 

GAG CAC CAA TTT ACG AAG ATT GAC ACC ATG GCG GCT GAT GAG AGC TTC 484 
Glu His Gln Phe Thr Lys Ile Asp Thr Met Ala Ala Asp Glu Ser Phe
140 145 150 

ACC CAG ATG GAT CTT GGG GAC CGG ATT CTC AAG CTG AAT ACC GAA GTC 532 
Thr Gln Met Asp Leu Gly Asp Arg Ile Leu Lys Leu Asn Thr Glu Val
155 160 165 

CGC GAG GTG GGA CCT GTT AGT AAG AAG GGC TTT TAC TTG GCT TTC CAA 580 
Arg Glu Val Gly Pro Val Ser Lys Lys Gly Phe Tyr Leu Ala Phe Gln
170 175 180 

GAT GTA GGT GCA TGT GTT GCC TTA GTC TCG GTG CGA GTG TAC TTC AAG 628 
Asp Val Gly Ala Cys Val Ala Leu Val Ser Val Arg Val Tyr Phe Lys 
185 190 195 

AAG TGC CCT TTC ACT GTC AAG AAC CTC GCC ATG TTT CCA GAT ACA GTT 676 
Lys Cys Pro Phe Thr Val Lys Asn Leu Ala Met Phe Pro Asp Thr Val 
200 205 210 215 

CCT ATG GAC TCC CAG TCC CTG GTG GAG GTG CGG GGT TCT TGT GTC AAT 724 
Pro Met Asp Ser Gln Ser Leu Val Glu Val Arg Gly Ser Cys Val Asn
220 225 230 

CAT TCC AAG GAG GAA GAG CCA CCC AAG ATG TAC TGC AGC ACG GAA GGA 772 
His Ser Lys Glu Glu Glu Pro Pro Lys Met Tyr Cys Ser Thr Glu Gly 
235 240 245 

GAA TGG CTA GTG CCC ATA GGG AAG TGC TTG TGT AAT GCT GGC TAT GAA 820 
Glu Trp Leu Val Pro Ile Gly Lys Cys Leu Cys Asn Ala Gly Tyr Glu
250 255 260 

GAG AGA GGC TTT GCG TGC CAA GCT TGT CGA CCT GGG TTC TAT AAA GCT 868 
Glu Arg Gly Phe Ala Cys Gln Ala Cys Arg Pro Gly Phe Tyr Lys Ala
265 270 275 

TCT GCT GGC AAT GTG AAG TGT GCC AAA TGC CCA CCT CAC AGC TCT ACC 916 
Ser Ala Gly Asn Val Lys Cys Ala Lys Cys Pro Pro His Ser Ser Thr 
280 285 290 295

TAT GAA GAT GCA TCT CTG AAC TGC AGG TGT GAA AAG AAT TAC TTT CGC 964 
Tyr Glu Asp Ala Ser Leu Asn Cys Arg Cys Glu Lys Asn Tyr Phe Arg 
300 305 310 

TCT GAG AAA GAC CCT CCA TCC ATG GCT TGC ACC AGA CCA CCA TCT GCT 1012 
Ser Glu Lys Asp Pro Pro Ser Met Ala Cys Thr Arg Pro Pro Ser Ala 
315 320 325



CCA AGA AAC GTT ATT TCT AAC ATC AAT GAG ACA TCT GTT ATT CTG GAC 1060
Pro Arg Asn Val Ile Ser Asn Ile Asn Glu Thr Ser Val Ile Leu Asp
330 335 340 

TGG AGC TGG CCT CTT GAT ACA GGA GGT CGA AAA GAT GTC ACT TTC AAC 1108 
Trp Ser Trp Pro Leu Asp Thr Gly Gly Arg Lys Asp Val Thr Phe Asn 
345 350 355 

ATC ATT TGC AAA AAA TGT GGA GGA AGC AGC AAG ATA TGT GAG CCT TGC 1156 
Ile Ile Cys Lys Lys Cys Gly Gly Ser Ser Lys Ile Cys Glu Pro Cys
360 365 370 375 

AGT GAC AAC GTA CGG TTC TTA CCC CGT CAG ACT GGC CTC ACC AAC ACC 1204 
Ser Asp Asn Val Arg Phe Leu Pro Arg Gln Thr Gly Leu Thr Asn Thr
380 385 390 

ACG GTG ACA GTA GTG GAC CTT TTG GCA CAT ACC AAT TAC ACT TTT GAG 1252 
Thr Val Thr Val Val Asp Leu Leu Ala His Thr Asn Tyr Thr Phe Glu 
395 400 405 

ATT GAT GCA GTC AAC GGG GTA TCT GAC TTG AGT ACA CTT TCG AGA CAA 1300 
Ile Asp Ala Val Asn Gly Val Ser Asp Leu Ser Thr Leu Ser Arg Gln
410 415 420 

TTT GCT GCT GTC AGC ATC ACG ACT AAT CAG GCT GCG CCA TCC CCC ATC 1348 
Phe Ala Ala Val Ser Ile Thr Thr Asn Gln Ala Ala Pro Ser Pro Ile
425 430 435 

ACA GTG ATA AGG AAC GAC CGG ACA TCC AGG AAC AGC GTG TCT CTG TCT 1396 
Thr Val Ile Arg Asn Asp Arg Thr Ser Arg Asn Ser Val Ser Leu Ser
440 445 450 455 

TGG CAG GAG CCT GAG CAC CCA AAT GGA ATC ATC TTG GAC TAC GAG GTC 1444 
Trp Gln Glu Pro Glu His Pro Asn Gly Ile Ile Leu Asp Tyr Glu Val
460 465 470 

AAA TAC TAC GAA AAG CAG GAA CAA GAG ACA AGC TAT ACT ATT CTG AGA 1492
Lys Tyr Tyr Glu Lys Gln Glu Gln Glu Thr Ser Tyr Thr Ile Leu Arg
475 480 485 

GCC AAA AGC ACT AAC GTT ACT ATC AGC GGC CTC AAA CCT GAT ACC ACC 1540 
Ala Lys Ser Thr Asn Val Thr Ile Ser Gly Leu Lys Pro Asp Thr Thr
490 495 500 

TAC GTC TTC CAA ATT CGA GCC CGA ACT GCA GCT AGA TAT GGG ACA AGC 1588 
Tyr Val Phe Gln Ile Arg Ala Arg Thr Ala Ala Arg Tyr Gly Thr Ser
505 510 515 

AGC CGC AAG TTT GAA TTT GAA ACC AGT CCA GAT TCA TTC TCC ATT TCC 1636 
Ser Arg Lys Phe Glu Phe Glu Thr Ser Pro Asp Ser Phe Ser Ile Ser
520 525 530 535 

AGT GAA AAT AGC CAG GTC GTT ATG ATT GCC ATT TCA GCT GCA GTT GCC 1684 
Ser Glu Asn Ser Gln Val Val Met Ile Ala Ile Ser Ala Ala Val Ala
540 545 550 

ATC ATT CTC CTC ACG GTT GTT GTG TAC GTC TTG ATT GGG AGA TTC TGC 1732 
Ile Ile Leu Leu Thr Val Val Val Tyr Val Leu Ile Gly Arg Phe Cys
555 560 565 

GGA TAC AAG AAG TCT AAA CAT GGT ACC GAT GAG AAA AGA CTA CAT TTT 1780 
Gly Tyr Lys Lys Ser Lys His Gly Thr Asp Glu Lys Arg Leu His Phe 
570 575 580 

GGG AAT GGC CAC TTA AAA CTC CCA GGC CTG AGA ACT TAT GTA GAT CCA 1828 
Gly Asn Gly His Leu Lys Leu Pro Gly Leu Arg Thr Tyr Val Asp Pro 
585 590 595 
CAT ACG TAC GAA GAT CCC AAT CAA GCT GTA CAT GAA TTT GCC AAG GAA 1876
His Thr Tyr Glu Asp Pro Asn Gln Ala Val His Glu Phe Ala Lys Glu
600 605 610 615 

CTA GAT GCT TCT AAT ATA TCA ATT GAT AAA GTT GTT GGA GCA GGG GAA 1924 
Leu Asp Ala Ser Asn Ile Ser Ile Asp Lys Val Val Gly Ala Gly Glu
620 625 630 

TTT GGA GAA GTG TGC AGT GGG CGC CTG AAG CTG CCT TCT AAA AAG GAA 1972 
Phe Gly Glu Val Cys Ser Gly Arg Leu Lys Leu Pro Ser Lys Lys Glu 
635 640 645 

ATT TCA GTG GCC ATC AAA ACT CTG AAA GCT GGC TAC ACA GAA AAA CAG 2020 
Ile Ser Val Ala Ile Lys Thr Leu Lys Ala Gly Tyr Thr Glu Lys Gln
650 655 660 

AGA AGG GAT TTC CTG GGA GAA GCA AGC ATC ATG GGG CAG TTT GAC CAC 2068 
Arg Arg Asp Phe Leu Gly Glu Ala Ser Ile Met Gly Gln Phe Asp His
665 670 675 

CCC AAC ATC ATC CGA CTG GAG GGC GTT GTG ACT AAA AGT AAA CCA GTT 2116 
Pro Asn Ile Ile Arg Leu Glu Gly Val Val Thr Lys Ser Lys Pro Val
680 685 690 695 

ATG ATT GTT ACT GAA TAC ATG GAA AAC GGT TCC TTG GAC AGC TTC CTA 2164 
Met Ile Val Thr Glu Tyr Met Glu Asn Gly Ser Leu Asp Ser Phe Leu
700 705 710 

CGG AAA CAT GAT GCC CAG TTC ACA GTC ATT CAG CTA GTA GGC ATG CTT 2212 
Arg Lys His Asp Ala Gln Phe Thr Val Ile Gln Leu Val Gly Met Leu
715 720 725 

CGT GGG ATC GCA TCT GGC ATG AAA TAT TTG TCA GAT ATG GGT TAT GTC 2260 
Arg Gly Ile Ala Ser Gly Met Lys Tyr Leu Ser Asp Met Gly Tyr Val
730 735 740 

CAC CGA GAT CTA GCT GCT CGT AAT ATA CTC ATC AAT AGT AAC TTG GTG 2308 
His Arg Asp Leu Ala Ala Arg Asn Ile Leu Ile Asn Ser Asn Leu Val
745 750 755 

TGC AAA GTC TCA GAT TTT GGT CTT TCT CGT GTA TTG GAA GAT GAC CCA 2356 
Cys Lys Val Ser Asp Phe Gly Leu Ser Arg Val Leu Glu Asp Asp Pro 
760 765 770 775 

GAA GCT GCT TAC ACA ACA AGG GGG GGC AAG ATT CCC ATC CGA TGG ACG 2404 
Glu Ala Ala Tyr Thr Thr Arg Gly Gly Lys Ile Pro Ile Arg Trp Thr
780 785 790 

TCA CCA GAA GCC ATT GCA TAC CGG AAG TTC ACA TCA GCC AGT GAT GCG 2452 
Ser Pro Glu Ala Ile Ala Tyr Arg Lys Phe Thr Ser Ala Ser Asp Ala
795 800 805 

TGG AGC TAT GGG ATT GTC CTC TGG GAG GTG ATG TCT TAT GGA GAA AGG 2500 
Trp Ser Tyr Gly Ile Val Leu Trp Glu Val Met Ser Tyr Gly Glu Arg
810 815 820 

CCG TAC TGG GAG ATG TCC TTC CAG GAC GTA ATT AAA GCC GTT GAT GAA 2548 
Pro Tyr Trp Glu Met Ser Phe Gln Asp Val Ile Lys Ala Val Asp Glu
825 830 835 

GGG TAT CGC TTG CCA CCT CCT ATG GAC TGC CCA GCT GCC TTG TAT CAG 2596 
Gly Tyr Arg Leu Pro Pro Pro Met Asp Cys Pro Ala Ala Leu Tyr Gln
840 845 850 855 

CTG ATG CTG GAC TGC TGG CAG AAA GAC AGA AAC AAC AGA CCC AAG TTT 2644 
Leu Met Leu Asp Cys Trp Gln Lys Asp Arg Asn Asn Arg Pro Lys Phe
860 865 870 
GAG CAG ATT GTC AGC ATC CTG GAT AAG CTG ATC CGT AAT CCC AGC AGT 2692
Glu Gln Ile Val Ser Ile Leu Asp Lys Leu Ile Arg Asn Pro Ser Ser
875 880 885

CTG AAA ATA ATC ACC AAT GCG GCA GCA AGG CCA TCA AAT CTT CTC CTG 2740
Leu Lys Ile Ile Thr Asn Ala Ala Ala Arg Pro Ser Asn Leu Leu Leu
890 895 900

GAC CAA AGT AAC ATT GAC ATT TCA GCG TTC CGC ACG GCA GGT GAT TGG 2788
Asp Gln Ser Asn Ile Asp Ile Ser Ala Phe Arg Thr Ala Gly Asp Trp
905 910 915

CTC AAT GGT TTT CGA ACA GGA CAG TGC AAA GGC ATT TTC ACG GGT GTG 2836
Leu Asn Gly Phe Arg Thr Gly Gln Cys Lys Gly Ile Phe Thr Gly Val
920 925 930 935

GAG TAC AGC TCC TGT GAT ACA ATA GCC AAG ATT TCC ACT GAT GAC ATG 2884
Glu Tyr Ser Ser Cys Asp Thr Ile Ala Lys Ile Ser Thr Asp Asp Met
940 945 950

AAG AAA GTT GGT GTT ACA GTT GTG GGG CCT CAA AAG AAG ATT GTT AGC 2932
Lys Lys Val Gly Val Thr Val Val Gly Pro Gln Lys Lys Ile Val Ser
955 960 965

AGT ATC AAA ACT CTA GAA ACT CAT ACG AAG AAC AGC CCT GTT CCT GTG 2980
Ser Ile Lys Thr Leu Glu Thr His Thr Lys Asn Ser Pro Val Pro Val
970 975 980

TAAGGTACCA AAATGATGTT GCTGAGGACA GAAAAAAAAG AAAAGTCGCA TCAAAGTGCA 3040

AAAGCGATGG CTGATAAACG GCACGGTTTA AAGGAGTTCT TTGCAGCAGT TTTGGAAACA 3100

TAATGGTTGA AATTTCAAAC CCACTGAGAC ACTCAAATAC TGAGTATAAA TGCCTTAAAA 3160

ATAGGAGCGA ACTTGTTTTC TATCTGTTAA TCCTGAAGGG TGGGTGCTCT TAACTGACTG 3220

TTAATGCAGA TAGTAAATTT CAAAAAAAAA AACG 3254



(2) INFORMATION FOR SEQ ID NO: 16: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 983 amino acids 

(B) TYPE: amino acid
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 16: 

Met Asp Arg Arg Arg Leu Pro Leu Leu Leu Leu Cys Ala Ala Leu Gly 
1 5 10 15

Ser Ala Gly Arg Leu Ser Ala Arg Pro Gly Asn Glu Val Asn Leu Leu 
20 25 30 

Asp Ser Lys Thr Ile Gln Gly Glu Leu Gly Trp Ile Ser Tyr Pro Ser
35 40 45

His Gly Trp Glu Glu Ile Ser Gly Val Asp Glu His Tyr Thr Pro Ile
50 55 60

Arg Thr Tyr Gln Glu Ser Asn Val Met Asp His Ser Gln Asn Asn Trp
65 70 75 80

Leu Arg Thr Asn Trp Ile Pro Arg Asn Ser Ala Gln Lys Ile Tyr Val
85 90 95
Glu Leu Lys Phe Thr Leu Arg Asp Cys Asn Ser Ile Pro Leu Val Leu
100 105 110 

Gly Thr Cys Lys Glu Thr Phe Asn Leu Tyr Tyr Met Glu Ser Asp Asp 
115 120 125 

Asp His Leu Ala Lys Phe Arg Glu His Gln Phe Thr Lys Ile Asp Thr
130 135 140 

Met Ala Ala Asp Glu Ser Phe Thr Gln Met Asp Leu Gly Asp Arg Ile
145 150 155 160 

Leu Lys Leu Asn Thr Glu Val Arg Glu Val Gly Pro Val Ser Lys Lys 
165 170 175 

Gly Phe Tyr Leu Ala Phe Gln Asp Val Gly Ala Cys Val Ala Leu Val
180 185 190 

Ser Val Arg Val Tyr Phe Lys Lys Cys Pro Phe Thr Val Lys Asn Leu 
195 200 205 

Ala Met Phe Pro Asp Thr Val Pro Met Asp Ser Gln Ser Leu Val Glu
210 215 220 

Val Arg Gly Ser Cys Val Asn His Ser Lys Glu Glu Glu Pro Pro Lys 
225 230 235 240 

Met Tyr Cys Ser Thr Glu Gly Glu Trp Leu Val Pro Ile Gly Lys Cys
245 250 255

Leu Cys Asn Ala Gly Tyr Glu Glu Arg Gly Phe Ala Cys Gln Ala Cys
260 265 270

Arg Pro Gly Phe Tyr Lys Ala Ser Ala Gly Asn Val Lys Cys Ala Lys
275 280 285

Cys Pro Pro His Ser Ser Thr Tyr Glu Asp Ala Ser Leu Asn Cys Arg
290 295 300

Cys Glu Lys Asn Tyr Phe Arg Ser Glu Lys Asp Pro Pro Ser Met Ala
305 310 315 320

Cys Thr Arg Pro Pro Ser Ala Pro Arg Asn Val Ile Ser Asn Ile Asn
325 330 335

Glu Thr Ser Val Ile Leu Asp Trp Ser Trp Pro Leu Asp Thr Gly Gly
340 345 350

Arg Lys Asp Val Thr Phe Asn Ile Ile Cys Lys Lys Cys Gly Gly Ser
355 360 365

Ser Lys Ile Cys Glu Pro Cys Ser Asp Asn Val Arg Phe Leu Pro Arg
370 375 380

Gln Thr Gly Leu Thr Asn Thr Thr Val Thr Val Val Asp Leu Leu Ala
385 390 395 400

His Thr Asn Tyr Thr Phe Glu Ile Asp Ala Val Asn Gly Val Ser Asp
405 410 415

Leu Ser Thr Leu Ser Arg Gln Phe Ala Ala Val Ser Ile Thr Thr Asn
420 425 430

Gln Ala Ala Pro Ser Pro Ile Thr Val Ile Arg Asn Asp Arg Thr Ser
435 440 445
Arg Asn Ser Val Ser Leu Ser Trp Gln Glu Pro Glu His Pro Asn Gly
450 455 460

Ile Ile Leu Asp Tyr Glu Val Lys Tyr Tyr Glu Lys Gln Glu Gln Glu
465 470 475 480

Thr Ser Tyr Thr Ile Leu Arg Ala Lys Ser Thr Asn Val Thr Ile Ser
485 490 495

Gly Leu Lys Pro Asp Thr Thr Tyr Val Phe Gln Ile Arg Ala Arg Thr
500 505 510

Ala Ala Arg Tyr Gly Thr Ser Ser Arg Lys Phe Glu Phe Glu Thr Ser 
515 520 525 

Pro Asp Ser Phe Ser Ile Ser Ser Glu Asn Ser Gln Val Val Met Ile
530 535 540

Ala Ile Ser Ala Ala Val Ala Ile Ile Leu Leu Thr Val Val Val Tyr
545 550 555 560

Val Leu Ile Gly Arg Phe Cys Gly Tyr Lys Lys Ser Lys His Gly Thr
565 570 575

Asp Glu Lys Arg Leu His Phe Gly Asn Gly His Leu Lys Leu Pro Gly 
580 585 590 

Leu Arg Thr Tyr Val Asp Pro His Thr Tyr Glu Asp Pro Asn Gln Ala
595 600 605

Val His Glu Phe Ala Lys Glu Leu Asp Ala Ser Asn Ile Ser Ile Asp
610 615 620

Lys Val Val Gly Ala Gly Glu Phe Gly Glu Val Cys Ser Gly Arg Leu 
625 630 635 640 

Lys Leu Pro Ser Lys Lys Glu Ile Ser Val Ala Ile Lys Thr Leu Lys
645 650 655

Ala Gly Tyr Thr Glu Lys Gln Arg Arg Asp Phe Leu Gly Glu Ala Ser
660 665 670

Ile Met Gly Gln Phe Asp His Pro Asn Ile Ile Arg Leu Glu Gly Val
675 680 685

Val Thr Lys Ser Lys Pro Val Met Ile Val Thr Glu Tyr Met Glu Asn
690 695 700

Gly Ser Leu Asp Ser Phe Leu Arg Lys His Asp Ala Gln Phe Thr Val
705 710 715 720

Ile Gln Leu Val Gly Met Leu Arg Gly Ile Ala Ser Gly Met Lys Tyr
725 730 735

Leu Ser Asp Met Gly Tyr Val His Arg Asp Leu Ala Ala Arg Asn Ile
740 745 750

Leu Ile Asn Ser Asn Leu Val Cys Lys Val Ser Asp Phe Gly Leu Ser
755 760 765

Arg Val Leu Glu Asp Asp Pro Glu Ala Ala Tyr Thr Thr Arg Gly Gly
770 775 780

Lys Ile Pro Ile Arg Trp Thr Ser Pro Glu Ala Ile Ala Tyr Arg Lys
785 790 795 800
Phe Thr Ser Ala Ser Asp Ala Trp Ser Tyr Gly Ile Val Leu Trp Glu
805 810 815

Val Met Ser Tyr Gly Glu Arg Pro Tyr Trp Glu Met Ser Phe Gln Asp
820 825 830

Val Ile Lys Ala Val Asp Glu Gly Tyr Arg Leu Pro Pro Pro Met Asp
835 840 845

Cys Pro Ala Ala Leu Tyr Gln Leu Met Leu Asp Cys Trp Gln Lys Asp
850 855 860

Arg Asn Asn Arg Pro Lys Phe Glu Gln Ile Val Ser Ile Leu Asp Lys
865 870 875 880

Leu Ile Arg Asn Pro Ser Ser Leu Lys Ile Ile Thr Asn Ala Ala Ala
885 890 895

Arg Pro Ser Asn Leu Leu Leu Asp Gln Ser Asn Ile Asp Ile Ser Ala
900 905 910

Phe Arg Thr Ala Gly Asp Trp Leu Asn Gly Phe Arg Thr Gly Gln Cys
915 920 925

Lys Gly Ile Phe Thr Gly Val Glu Tyr Ser Ser Cys Asp Thr Ile Ala
930 935 940

Lys Ile Ser Thr Asp Asp Met Lys Lys Val Gly Val Thr Val Val Gly
945 950 955 960

Pro Gln Lys Lys Ile Val Ser Ser Ile Lys Thr Leu Glu Thr His Thr
965 970 975

Lys Asn Ser Pro Val Pro Val 
980 

(2) INFORMATION FOR SEQ ID NO: 17: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 4049 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: both

(D) TOPOLOGY: linear 

(ix) FEATURE: 

(A) NAME/KEY: CDS 

(B) LOCATION: 10..2994

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 17:

CGGCTTCTG ATG CCC GGC CCG GAG CGC ACC ATG GGG CCG TTG TGG TTC 48 
Met Pro Gly Pro Glu Arg Thr Met Gly Pro Leu Trp Phe 
1 5 10 

TGC TGT TTG CCC CTC GCC CTC TTG CCT CTG CTC GCC GCC GTG GAA GAG 96 
Cys Cys Leu Pro Leu Ala Leu Leu Pro Leu Leu Ala Ala Val Glu Glu 
15 20 25 

ACG CTG ATG GAC TCC ACA ACG GCC ACA GCA GAG CTG GGC TGG ATG GTG 144 
Thr Leu Met Asp Ser Thr Thr Ala Thr Ala Glu Leu Gly Trp Met Val 
30 35 40 45 



CAT CCT CCC TCA GGG TGG GAA GAG GTG AGT GGA TAC GAT GAG AAC ATG 192
His Pro Pro Ser Gly Trp Glu Glu Val Ser Gly Tyr Asp Glu Asn Met
50 55 60
AAC ACC ATC CGC ACC TAC CAG GTG TGC AAC GTC TTT GAA TCC AGC CAA 240
Asn Thr Ile Arg Thr Tyr Gln Val Cys Asn Val Phe Glu Ser Ser Gln
65 70 75 

AAC AAC TGG CTG CGG ACC AAG TAC ATC CGG AGG CGA GGA GCG CAC CGC 288 
Asn Asn Trp Leu Arg Thr Lys Tyr Ile Arg Arg Arg Gly Ala His Arg
80 85 90 

ATC CAC GTG GAG ATG AAA TTC TCC GTT CGG GAC TGC AGC AGC ATC CCC 336 
Ile His Val Glu Met Lys Phe Ser Val Arg Asp Cys Ser Ser Ile Pro
95 100 105 

AAC GTC CCG GGC TCC TGT AAG GAG ACT TTT AAC CTC TAT TAC TAC GAA 384 
Asn Val Pro Gly Ser Cys Lys Glu Thr Phe Asn Leu Tyr Tyr Tyr Glu 
110 115 120 125 

TCA GAC TTT GAC TCT GCC ACC AAG ACT TTT CCT AAC TGG ATG GAA AAC 432 
Ser Asp Phe Asp Ser Ala Thr Lys Thr Phe Pro Asn Trp Met Glu Asn 
130 135 140 

CCT TGG ATG AAG GTA GAT ACA ATT GCT GCC GAC GAG AGC TTC TCG CAG 480 
Pro Trp Met Lys Val Asp Thr Ile Ala Ala Asp Glu Ser Phe Ser Gln
145 150 155 

GTG GAC CTT GGT GGG CGG GTG ATG AAG ATT AAC ACC GAG GTG CGC AGT 528 
Val Asp Leu Gly Gly Arg Val Met Lys Ile Asn Thr Glu Val Arg Ser
160 165 170 

TTT GGG CCT GTC TCC AAA AAC GGT TTC TAC CTG GCC TTC CAG GAC TAC 576 
Phe Gly Pro Val Ser Lys Asn Gly Phe Tyr Leu Ala Phe Gln Asp Tyr
175 180 185 

GGG GGC TGC ATG TCC TTG ATT GCA GTC CGT GTC TTT TAC CGC AAG TGT 624 
Gly Gly Cys Met Ser Leu Ile Ala Val Arg Val Phe Tyr Arg Lys Cys
190 195 200 205 

CCC CGT GTG ATC CAG AAC GGG GCG GTC TTC CAG GAA ACC CTC TCG GGA 672 
Pro Arg Val Ile Gln Asn Gly Ala Val Phe Gln Glu Thr Leu Ser Gly
210 215 220 

GCG GAG AGC ACA TCT CTG GTG GCA GCC CGG GGG ACG TGC ATC AGC AAT 720 
Ala Glu Ser Thr Ser Leu Val Ala Ala Arg Gly Thr Cys Ile Ser Asn
225 230 235 

GCG GAG GAG GTG GAT GTG CCC ATC AAG CTG TAC TGC AAT GGG GAT GGC 768 
Ala Glu Glu Val Asp Val Pro Ile Lys Leu Tyr Cys Asn Gly Asp Gly
240 245 250 

GAG TGG CTG GTG CCC ATC GGC CGC TGC ATG TGC AGG CCG GGC TAT GAG 816 
Glu Trp Leu Val Pro He Gly Arg Cys Met Cys Arg Pro Gly Tyr Glu 
255 260 265 

TCG GTG GAG AAT GGG ACC GTC TGC AGA GGC TGC CCA TCA GGG ACC TTC 864 
Ser Val Glu Asn Gly Thr Val Cys Arg Gly Cys Pro Ser Gly Thr Phe 
270 275 280 285 

AAG GCC AGC CAA GGA GAT GAA GGA TGT GTC CAT TGT CCA ATT AAC AGC 912 
Lys Ala Ser Gin Gly Asp Glu Gly Cys Val His Cys Pro He Asn Ser 
290 295 300 

CGG ACG ACT TCG GAA GGG GCC ACG AAC TGC GTG TGC CGA AAC GGA TAT 960 
Arg Thr Thr Ser Glu Gly Ala Thr Asn Cys Val Cys Arg Asn Gly Tyr 
305 310 315 



TAC CGG GCA GAT GCT GAC CCC GTC GAC ATG CCA TGC ACC ACC ATC CCA 1008 
Tyr Arg Ala Asp Ala Asp Pro Val Asp Met Pro Cys Thr Thr Ile Pro 
320 325 330 

TCT GCC CCC CAG GCC GTG ATC TCC AGC GTG AAT GAA ACC TCC CTG ATG 1056 
Ser Ala Pro Gln Ala Val Ile Ser Ser Val Asn Glu Thr Ser Leu Met 
335 340 345 

CTG GAG TGG ACC CCA CCA CGA GAC TCA GGG GGC CGG GAG GAT CTG GTA 1104 
Leu Glu Trp Thr Pro Pro Arg Asp Ser Gly Gly Arg Glu Asp Leu Val 
350 355 360 365 

TAC AAC ATC ATC TGC AAG AGC TGT GGG TCA GGC CGT GGG GCG TGC ACG 1152 
Tyr Asn Ile Ile Cys Lys Ser Cys Gly Ser Gly Arg Gly Ala Cys Thr 
370 375 380 

CGC TGT GGG GAC AAC GTG CAG TTT GCC CCA CGC CAG CTG GGC CTG ACG 1200 
Arg Cys Gly Asp Asn Val Gin Phe Ala Pro Arg Gin Leu Gly Leu Thr 
385 390 395 

GAG CCT CGC ATC TAC ATC AGC GAC CTG CTG GCC CAC ACG CAG TAC ACC 1248 
Glu Pro Arg Ile Tyr Ile Ser Asp Leu Leu Ala His Thr Gln Tyr Thr 
400 405 410 

TTT GAG ATC CAG GCT GTG AAT GGG GTC ACC GAC CAG AGC CCC TTC TCC 1296 
Phe Glu He Gin Ala Val Asn Gly Val Thr Asp Gin Ser Pro Phe Ser 
415 420 425 

CCA CAG TTT GCA TCA GTG AAT ATC ACC ACC AAC CAG GCT GCT CCT TCA 1344 
Pro Gin Phe Ala Ser Val Asn He Thr Thr Asn Gin Ala Ala Pro Ser 
430 435 440 445 

GCC GTG TCC ATA ATG CAC CAG GTC AGC CGC ACT GTG GAC AGC ATT ACC 1392 
Ala Val Ser He Met His Gin Val Ser Arg Thr Val Asp Ser He Thr 
450 455 460 

CTC TCG TGG TCT CAA CCT GAC CAG CCC AAT GGA GTC ATC CTG GAT TAT 1440 
Leu Ser Trp Ser Gin Pro Asp Gin Pro Asn Gly Val He Leu Asp Tyr 
465 470 475 

GAG CTG CAA TAC TAT GAG AAG AAC CTG AGT GAG TTA AAT TCA ACA GCA 1488 
Glu Leu Gin Tyr Tyr Glu Lys Asn Leu Ser Glu Leu Asn Ser Thr Ala 
480 485 490 

GTG AAG AGC CCC ACC AAC ACT GTG ACA GTG CAA AAC CTC AAA GCT GGC 1536 
Val Lys Ser Pro Thr Asn Thr Val Thr Val Gin Asn Leu Lys Ala Gly 
495 500 505 

ACC ATC TAT GTC TTC CAA GTG CGA GCA CGT ACC GTG GCT GGG TAT GGC 1584 
Thr He Tyr Val Phe Gin Val Arg Ala Arg Thr Val Ala Gly Tyr Gly 
510 515 520 525 

CGG TAT AGT GGC AAG ATG TAC TTC CAG ACC ATG ACT GAA GCC GAG TAC 1632 
Arg Tyr Ser Gly Lys Met Tyr Phe Gin Thr Met Thr Glu Ala Glu Tyr 
530 535 540 

CAG ACC AGT GTC CAG GAG AAG CTG CCA CTC ATC ATT GGC TCC TCT GCA 1680 
Gin Thr Ser Val Gin Glu Lys Leu Pro Leu He He Gly Ser Ser Ala 
545 550 555 

GCA GGA CTG GTG TTT CTC ATT GCT GTT GTC GTC ATC ATT ATT GTC TGC 1728 
Ala Gly Leu Val Phe Leu He Ala Val Val Val He He He Val Cys 
560 565 570 

AAC AGA AGA CGG GGC TTT GAA CGT GCT GAC TCT GAG TAC ACT GAC AAG 1776 
Asn Arg Arg Arg Gly Phe Glu Arg Ala Asp Ser Glu Tyr Thr Asp Lys 
575 580 585 

CTG CAG CAC TAT ACC AGT GGC CAC ATG ACT CCA GGG ATG AAG ATT TAT 1824 
Leu Gin His Tyr Thr Ser Gly His Met Thr Pro Gly Met Lys He Tyr 
590 595 600 605 



WO 95/15375 PCT/US94/10140 




ATC GAT CCA TTT ACC TAC GAA GAT CCC AAT GAG GCT GTC AGG GAA TTT 1872 
Ile Asp Pro Phe Thr Tyr Glu Asp Pro Asn Glu Ala Val Arg Glu Phe 
610 615 620 

GCA AAA GAA ATT GAT ATC TCC TGT GTG AAA ATC GAG CAG GTG ATT GGG 1920 
Ala Lys Glu Ile Asp Ile Ser Cys Val Lys Ile Glu Gln Val Ile Gly 
625 630 635 

GCA GGG GAG TTT GGT GAG GTG TGC AGT GGG CAT CTC AAG CTT CCT GGC 1968 
Ala Gly Glu Phe Gly Glu Val Cys Ser Gly His Leu Lys Leu Pro Gly 
640 645 650 

AAA AGA GAG ATC TTT GTG GCC ATC AAG ACC CTG AAG TCT GGT TAC ACA 2016 
Lys Arg Glu He Phe Val Ala He Lys Thr Leu Lys Ser Gly Tyr Thr 
655 660 665 

GAG AAG CAG AGA CGG GAC TTC CTG AGT GAA GCC AGC ATC ATG GGG CAG 2064 
Glu Lys Gin Arg Arg Asp Phe Leu Ser Glu Ala Ser He Met Gly Gin 
670 675 680 685 

TTT GAC CAC CCC AAT GTC ATC CAC CTG GAA GGG GTG GTG ACC AAG AGT 2112 
Phe Asp His Pro Asn Val Ile His Leu Glu Gly Val Val Thr Lys Ser 
690 695 700 

TCC CCA GTC ATG ATC ATT ACA GAG TTC ATG GAG AAT GGC TCG TTG GAC 2160 
Ser Pro Val Met He He Thr Glu Phe Met Glu Asn Gly Ser Leu Asp 
705 710 715 

TCC TTC TTG AGG CAA AAT GAT GGG CAG TTC ACA GTG ATC CAG CTG GTG 2208 
Ser Phe Leu Arg Gin Asn Asp Gly Gin Phe Thr Val He Gin Leu Val 
720 725 730 

GGC ATG TTG CGT GGC ATT GCA GCA GGC ATG AAG TAC CTG GCT GAT ATG 2256 
Gly Met Leu Arg Gly He Ala Ala Gly Met Lys Tyr Leu Ala Asp Met 
735 740 745 

AAC TAC GTG CAC CGG GAC CTG GCT GCC CGC AAC ATC CTG GTC AAC AGC 2304 
Asn Tyr Val His Arg Asp Leu Ala Ala Arg Asn He Leu Val Asn Ser 
750 755 760 765 

AAC CTG GTC TGC AAG GTG TCC GAC TTC GGC CTC TCC CGT TTC CTG GAG 2352 
Asn Leu Val Cys Lys Val Ser Asp Phe Gly Leu Ser Arg Phe Leu Glu 
770 775 780 

GAT GAC ACC TCT GAT CCC ACT TAC ACC AGC GCA CTG GGT GGA AAG ATC 2400 
Asp Asp Thr Ser Asp Pro Thr Tyr Thr Ser Ala Leu Gly Gly Lys He 
785 790 795 

CCA ATA CGG TGG ACA GCG CCT GAG GCA ATT CAG TAC CGA AAA TTC ACA 2448 
Pro Ile Arg Trp Thr Ala Pro Glu Ala Ile Gln Tyr Arg Lys Phe Thr 
800 805 810 

TCA GCC AGC GAT GTG TGG AGC TAT GGA ATA GTC ATG TGG GAG GTG ATG 2496 
Ser Ala Ser Asp Val Trp Ser Tyr Gly Ile Val Met Trp Glu Val Met 
815 820 825 

TCG TAC GGC GAG CGG CCT TAC TGG GAC ATG ACC AAT CAA GAT GTG ATA 2544 
Ser Tyr Gly Glu Arg Pro Tyr Trp Asp Met Thr Asn Gin Asp Val He 
830 835 840 845 

AAT GCT ATT GAG CAG GAC TAT CGG CTA CCA CCC CCT ATG GAT TGT CCA 2592 
Asn Ala He Glu Gin Asp Tyr Arg Leu Pro Pro Pro Met Asp Cys Pro 
850 855 860 



AAT GCC CTG CAC CAG CTA ATG CTT GAC TGC TGG CAG AAG GAT CGA AAC 2640 
Asn Ala Leu His Gln Leu Met Leu Asp Cys Trp Gln Lys Asp Arg Asn 
865 870 875 

CAC AGA CCC AAA TTT GGA CAG ATT GTC AAC ACT TTA GAC AAA ATG ATC 2688 
His Arg Pro Lys Phe Gly Gln Ile Val Asn Thr Leu Asp Lys Met Ile 
880 885 890 

CGA AAT CCT AAT AGT CTG AAA GCC ATG GCA CCT CTC TCC TCT GGG GTT 2736 
Arg Asn Pro Asn Ser Leu Lys Ala Met Ala Pro Leu Ser Ser Gly Val 
895 900 905 

AAC CTC CCT CTA CTT GAC CGC ACA ATC CCA GAT TAT ACC AGC TTC AAC 2784 
Asn Leu Pro Leu Leu Asp Arg Thr Ile Pro Asp Tyr Thr Ser Phe Asn 
910 915 920 925 

ACT GTG GAT GAA TGG CTG GAT GCC ATC AAG ATG AGC CAG TAC AAG GAG 2832 
Thr Val Asp Glu Trp Leu Asp Ala He Lys Met Ser Gin Tyr Lys Glu 
930 935 940 

AGC TTT GCC AGT GCT GGC TTC ACC ACC TTT GAT ATA GTA TCT CAG ATG 2880 
Ser Phe Ala Ser Ala Gly Phe Thr Thr Phe Asp He Val Ser Gin Met 
945 950 955 

ACT GTA GAG GAC ATT CTA CGA GTT GGG GTC ACT TTA GCA GGA CAC CAG 2928 
Thr Val Glu Asp Ile Leu Arg Val Gly Val Thr Leu Ala Gly His Gln 
960 965 970 

AAG AAA ATT CTG AAC AGT ATC CAG GTG ATG AGA GCA CAG ATG AAC CAA 2976 
Lys Lys He Leu Asn Ser He Gin Val Met Arg Ala Gin Met Asn Gin 
975 980 985 

ATT CAG TCT GTG GAG GTT TGATAGCAAC ACGTCCTCGT GCTCCACTTC 3024 
He Gin Ser Val Glu Val 
990 995 

CTTGAGGCCC TGCTCCCCTC TGCCCCTGTG TGTCTGAGCT CCAGTTCTTG AGTGTTCTGC 3084 

GTGGATCAGA GACAGGCAGC TGCTCTGAGG ATCATGGCAA CAGGAAGAAA TGCCCTATCA 3144 

TTGACAACGA GAAGTCATCA AGAGGTGAAA CAATGGAAAA CAATGGAAAA AGGGAACAAG 3204 

TAAAGACAGC TATTTTGAAA ACCGAAAACA AACAGTGAAT TATTTTTAAA TAATAATAAA 3264 

GCAATTGCAG TCTTGAAAAG GGCTCCAAGA CCAATGGGAG TCTCCAAAGG AAGAGAATAG 3324 

AGCAGCTTCA TCTATTTCCT CTTACACAAG GGTTGCTGCA GCTGGGCCCA GACACTTCTG 3384 

GAGTAACGAG ACTTTTCAAG AAGATGAATG CAAAGAATGG TCACAAGAAG CACTTCTCTT 3444 

TCTCACATGG GATGGCAGCT CTGGGAATGC CCGGCAGTCC TTCCTGAAAG CCCTGTTGGC 3504 

AAATCGAAGA GGAGAGCCGA AGCTCTTTGG TGCTGTGGAA CCAAGTGCAT CTCAGAAATT 3564 

GTTGGACTTC TACAAAAGCT GAAGACATTC TTTTTTTTTA AACAAGTAAA CTGATACTAG 3624 

AAGAGGCTGT TTCCGTCAAA TGAGAAGGAA TCTGTAACAC TGGCCCGGGG GGGGTGGGGA 3684 

ATGGGGGAAA TCAGTCCTTT TTACATCTCT TTATTTTCTC TTGTCATGGA ACAGTTTTGT 3744 

GAGTGACAGT TTCCTAAGGG TCCGTCCATC CACCCTCCAA TGGCATCATT GTTTCATACA 3804 

TATCATATGC ACAAGACTTA TAGTGATGTC CTCACTCGAT GCCAATGATC TTTCCCCAGA 3864 

AGACTTCCCA AGTACAGTAT GTAGTAGATT TTGATTACAA ATGCTGACGT GTACCTTTAT 3924 

TTTTCGGTTG TCGTTGTTGG GAGATTCGTC CTTTTACCTT GCTTTGTTAA CACCAATTTG 3984 

TGAGTTTGGG GTTGGAATTT TTTTGGTCGA TTGGGGTTGT TTTTTTTTTT TTTTTTTTTT 4044 

AACCG 4049 

(2) INFORMATION FOR SEQ ID NO: 18: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 995 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:18: 

Met Pro Gly Pro Glu Arg Thr Met Gly Pro Leu Trp Phe Cys Cys Leu 
1 5 10 15 

Pro Leu Ala Leu Leu Pro Leu Leu Ala Ala Val Glu Glu Thr Leu Met 
20 25 30 

Asp Ser Thr Thr Ala Thr Ala Glu Leu Gly Trp Met Val His Pro Pro 
35 40 45 

Ser Gly Trp Glu Glu Val Ser Gly Tyr Asp Glu Asn Met Asn Thr Ile 
50 55 60 

Arg Thr Tyr Gin Val Cys Asn Val Phe Glu Ser Ser Gin Asn Asn Trp 
65 70 75 80 

Leu Arg Thr Lys Tyr He Arg Arg Arg Gly Ala His Arg He His Val 
85 90 95 

Glu Met Lys Phe Ser Val Arg Asp Cys Ser Ser He Pro Asn Val Pro 
100 105 110 

Gly Ser Cys Lys Glu Thr Phe Asn Leu Tyr Tyr Tyr Glu Ser Asp Phe 
115 120 125 

Asp Ser Ala Thr Lys Thr Phe Pro Asn Trp Met Glu Asn Pro Trp Met 
130 135 140 

Lys Val Asp Thr He Ala Ala Asp Glu Ser Phe Ser Gin Val Asp Leu 
145 150 155 160 

Gly Gly Arg Val Met Lys He Asn Thr Glu Val Arg Ser Phe Gly Pro 
165 170 175 

Val Ser Lys Asn Gly Phe Tyr Leu Ala Phe Gln Asp Tyr Gly Gly Cys 
180 185 190 

Met Ser Leu He Ala Val Arg Val Phe Tyr Arg Lys Cys Pro Arg Val 
195 200 205 

He Gin Asn Gly Ala Val Phe Gin Glu Thr Leu Ser Gly Ala Glu Ser 
210 215 220 

Thr Ser Leu Val Ala Ala Arg Gly Thr Cys He Ser Asn Ala Glu Glu 
225 230 235 240 

Val Asp Val Pro He Lys Leu Tyr Cys Asn Gly Asp Gly Glu Trp Leu 
245 250 255 

Val Pro He Gly Arg Cys Met Cys Arg Pro Gly Tyr Glu Ser Val Glu 
260 265 270 

Asn Gly Thr Val Cys Arg Gly Cys Pro Ser Gly Thr Phe Lys Ala Ser 
275 280 285 

Gln Gly Asp Glu Gly Cys Val His Cys Pro Ile Asn Ser Arg Thr Thr 
290 295 300 

Ser Glu Gly Ala Thr Asn Cys Val Cys Arg Asn Gly Tyr Tyr Arg Ala 
305 310 315 320 

Asp Ala Asp Pro Val Asp Met Pro Cys Thr Thr He Pro Ser Ala Pro 
325 330 335 

Gin Ala Val He Ser Ser Val Asn Glu Thr Ser Leu Met Leu Glu Trp 
340 345 350 

Thr Pro Pro Arg Asp Ser Gly Gly Arg Glu Asp Leu Val Tyr Asn He 
355 360 365 

He Cys Lys Ser Cys Gly Ser Gly Arg Gly Ala Cys Thr Arg Cys Gly 
370 375 380 

Asp Asn Val Gin Phe Ala Pro Arg Gin Leu Gly Leu Thr Glu Pro Arg 
385 390 395 400 

He Tyr He Ser Asp Leu Leu Ala His Thr Gin Tyr Thr Phe Glu He 
405 410 415 

Gin Ala Val Asn Gly Val Thr Asp Gin Ser Pro Phe Ser Pro Gin Phe 
420 425 430 

Ala Ser Val Asn He Thr Thr Asn Gin Ala Ala Pro Ser Ala Val Ser 
435 440 445 

He Met His Gin Val Ser Arg Thr Val Asp Ser He Thr Leu Ser Trp 
450 455 460 

Ser Gin Pro Asp Gin Pro Asn Gly Val He Leu Asp Tyr Glu Leu Gin 
465 470 475 480 

Tyr Tyr Glu Lys Asn Leu Ser Glu Leu Asn Ser Thr Ala Val Lys Ser 
485 490 495 

Pro Thr Asn Thr Val Thr Val Gln Asn Leu Lys Ala Gly Thr Ile Tyr 
500 505 510 

Val Phe Gin Val Arg Ala Arg Thr Val Ala Gly Tyr Gly Arg Tyr Ser 
515 520 525 

Gly Lys Met Tyr Phe Gin Thr Met Thr Glu Ala Glu Tyr Gin Thr Ser 
530 535 540 

Val Gin Glu Lys Leu Pro Leu He He Gly Ser Ser Ala Ala Gly Leu 
545 550 555 560 

Val Phe Leu He Ala Val Val Val He He He Val Cys Asn Arg Arg 
565 570 575 

Arg Gly Phe Glu Arg Ala Asp Ser Glu Tyr Thr Asp Lys Leu Gin His 
580 585 590 

Tyr Thr Ser Gly His Met Thr Pro Gly Met Lys He Tyr He Asp Pro 
595 600 605 

Phe Thr Tyr Glu Asp Pro Asn Glu Ala Val Arg Glu Phe Ala Lys Glu 
610 615 620 

Ile Asp Ile Ser Cys Val Lys Ile Glu Gln Val Ile Gly Ala Gly Glu 
625 630 635 640 

Phe Gly Glu Val Cys Ser Gly His Leu Lys Leu Pro Gly Lys Arg Glu 
645 650 655 

Ile Phe Val Ala Ile Lys Thr Leu Lys Ser Gly Tyr Thr Glu Lys Gln 
660 665 670 

Arg Arg Asp Phe Leu Ser Glu Ala Ser Ile Met Gly Gln Phe Asp His 
675 680 685 

Pro Asn Val He His Leu Glu Gly Val Val Thr Lys Ser Ser Pro Val 
690 695 700 

Met He He Thr Glu Phe Met Glu Asn Gly Ser Leu Asp Ser Phe Leu 
705 710 715 720 

Arg Gin Asn Asp Gly Gin Phe Thr Val He Gin Leu Val Gly Met Leu 
725 730 735 

Arg Gly Ile Ala Ala Gly Met Lys Tyr Leu Ala Asp Met Asn Tyr Val 
740 745 750 

His Arg Asp Leu Ala Ala Arg Asn He Leu Val Asn Ser Asn Leu Val 
755 760 765 

Cys Lys Val Ser Asp Phe Gly Leu Ser Arg Phe Leu Glu Asp Asp Thr 
770 775 780 

Ser Asp Pro Thr Tyr Thr Ser Ala Leu Gly Gly Lys He Pro He Arg 
785 790 795 800 

Trp Thr Ala Pro Glu Ala He Gin Tyr Arg Lys Phe Thr Ser Ala Ser 
805 810 815 

Asp Val Trp Ser Tyr Gly He Val Met Trp Glu Val Met Ser Tyr Gly 
820 825 830 

Glu Arg Pro Tyr Trp Asp Met Thr Asn Gin Asp Val He Asn Ala He 
835 840 845 

Glu Gin Asp Tyr Arg Leu Pro Pro Pro Met Asp Cys Pro Asn Ala Leu 
850 855 860 

His Gin Leu Met Leu Asp Cys Trp Gin Lys Asp Arg Asn His Arg Pro 
865 870 875 880 

Lys Phe Gly Gin He Val Asn Thr Leu Asp Lys Met He Arg Asn Pro 
885 890 895 

Asn Ser Leu Lys Ala Met Ala Pro Leu Ser Ser Gly Val Asn Leu Pro 
900 905 910 

Leu Leu Asp Arg Thr He Pro Asp Tyr Thr Ser Phe Asn Thr Val Asp 
915 920 925 

Glu Trp Leu Asp Ala He Lys Met Ser Gin Tyr Lys Glu Ser Phe Ala 
930 935 940 

Ser Ala Gly Phe Thr Thr Phe Asp He Val Ser Gin Met Thr Val Glu 
945 950 955 960 

Asp He Leu Arg Val Gly Val Thr Leu Ala Gly His Gin Lys Lys He 
965 970 975 

Leu Asn Ser He Gin Val Met Arg Ala Gin Met Asn Gin He Gin Ser 
980 985 990 

Val Glu Val 
995 

(2) INFORMATION FOR SEQ ID NO: 19: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 3125 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: both 

(D) TOPOLOGY: linear 

(ix) FEATURE: 

(A) NAME/KEY: CDS 

(B) LOCATION: 2..2233 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 19: 

C CTC AAA TTC ACC CTG AGG GAC TGT AAC AGC CTT CCA GGA GGA CTT 46 
Leu Lys Phe Thr Leu Arg Asp Cys Asn Ser Leu Pro Gly Gly Leu 
1 5 10 15 

GGG ACT TGC AAG GAG ACT TTT AAC ATG TAC TAC TTT GAG TCA GAT GAT 94 
Gly Thr Cys Lys Glu Thr Phe Asn Met Tyr Tyr Phe Glu Ser Asp Asp 
20 25 30 

GAA GAT GGG AGG AAC ATC AGA GAG AAT CAG TAC ATC AAG ATA GAT ACC 142 
Glu Asp Gly Arg Asn Ile Arg Glu Asn Gln Tyr Ile Lys Ile Asp Thr 
35 40 45 

ATT GCT GCT GAT GAG AGC TTC ACG GAG TTG GAC CTC GGC GAC AGA GTT 190 
He Ala Ala Asp Glu Ser Phe Thr Glu Leu Asp Leu Gly Asp Arg Val 
50 55 60 

ATG AAG TTA AAC ACA GAA GTG AGA GAT GTT GGG CCT CTA ACA AAA AAA 238 
Met Lys Leu Asn Thr Glu Val Arg Asp Val Gly Pro Leu Thr Lys Lys 
65 70 75 

GGA TTT TAC CTT GCT TTC CAG GAT GTG GGC GCC TGC ATT GCC CTG GTC 286 
Gly Phe Tyr Leu Ala Phe Gin Asp Val Gly Ala Cys He Ala Leu Val 
80 85 90 95 

TCT GTG CGT GTG TAC TAC AAG AAA TGC CCA TCA GTG ATC CGC AAC CTG 334 
Ser Val Arg Val Tyr Tyr Lys Lys Cys Pro Ser Val He Arg Asn Leu 
100 105 110 

GCA CGC TTT CCA GAT ACC ATC ACA GGA GCA GAT TCC TCG CAG CTG CTA 382 
Ala Arg Phe Pro Asp Thr He Thr Gly Ala Asp Ser Ser Gin Leu Leu 
115 120 125 

GAA GTG TCA GGC GTC TGT GTC AAC CAC TCA GTG ACT GAT GAG GCA CCA 430 
Glu Val Ser Gly Val Cys Val Asn His Ser Val Thr Asp Glu Ala Pro 
130 135 140 

AAG ATG CAC TGC AGT TCA GAG GGA GAA TGG CTG GTG CCC ATT GGG AAG 478 
Lys Met His Cys Ser Ser Glu Gly Glu Trp Leu Val Pro He Gly Lys 
145 150 155 

TGT TTG TGC AAG GCA GGG TAC GAG GAG AAG AAC AAC ACC TGC CAA GCA 526 
Cys Leu Cys Lys Ala Gly Tyr Glu Glu Lys Asn Asn Thr Cys Gin Ala 
160 165 170 175 

CCT TCT CCA GTC AGT AGT GTG AAA AAA GGG AAG ATA ACT AAA AAT AGC 574 
Pro Ser Pro Val Ser Ser Val Lys Lys Gly Lys He Thr Lys Asn Ser 
180 185 190 

ATC TCC CTT TCC TGG CAG GAG CCA GAT CGA CCC AAC GGC ATC ATC CTG 622 
Ile Ser Leu Ser Trp Gln Glu Pro Asp Arg Pro Asn Gly Ile Ile Leu 
195 200 205 

GAA TAC GAA ATC AAA TAT TTT GAA AAG GAC CAG GAG ACA AGC TAC ACC 670 
Glu Tyr Glu He Lys Tyr Phe Glu Lys Asp Gin Glu Thr Ser Tyr Thr 
210 215 220 

ATC ATC AAA TCC AAA GAG ACC GCA ATT ACG GCA GAT GGC TTG AAA CCA 718 
He He Lys Ser Lys Glu Thr Ala He Thr Ala Asp Gly Leu Lys Pro 
225 230 235 

GGC TCA GCG TAC GTC TTC CAG ATC CGA GCC CGG ACA GCT GCT GGC TAC 766 
Gly Ser Ala Tyr Val Phe Gin He Arg Ala Arg Thr Ala Ala Gly Tyr 
240 245 250 255 

GGT GGC TTC AGT CGA AGA TTT GAG TTT GAA ACC AGC CCA GTG TTA GCT 814 
Gly Gly Phe Ser Arg Arg Phe Glu Phe Glu Thr Ser Pro Val Leu Ala 
260 265 270 

GCA TCC AGT GAC CAG AGC CAG ATT CCT ATA ATT GTT GTG TCT GTA ACA 862 
Ala Ser Ser Asp Gin Ser Gin He Pro He He Val Val Ser Val Thr 
275 280 285 

GTG GGA GTT ATT CTG CTG GCT GTT GTT ATC GGT TTC CTT CTC AGT GGA 910 
Val Gly Val He Leu Leu Ala Val Val He Gly Phe Leu Leu Ser Gly 
290 295 300 

AGT TGC TGC GAT CAT GGC TGT GGG TGG GCT TCT TCT CTG CGT GCT GTT 958 
Ser Cys Cys Asp His Gly Cys Gly Trp Ala Ser Ser Leu Arg Ala Val 
305 310 315 

GCC TAT CCG AGC CTA ATA TGG CGC TGT GGC TAC AGC AAG GCT AAA CAA 1006 
Ala Tyr Pro Ser Leu He Trp Arg Cys Gly Tyr Ser Lys Ala Lys Gin 
320 325 330 335 

GAC CCA GAA GAA GAA AAG ATG CAT TTT CAT AAT GGC CAC ATT AAA CTG 1054 
Asp Pro Glu Glu Glu Lys Met His Phe His Asn Gly His He Lys Leu 
340 345 350 

CCT GGT GTA AGA ACC TAC ATT GAT CCC CAC ACC TAT GAG GAC CCT AAT 1102 
Pro Gly Val Arg Thr Tyr He Asp Pro His Thr Tyr Glu Asp Pro Asn 
355 360 365 

CAA GCT GTC CAC GAG TTT GCC AAG GAA ATA GAA GCT TCG TGC ATA ACC 1150 
Gin Ala Val His Glu Phe Ala Lys Glu He Glu Ala Ser Cys He Thr 
370 375 380 

ATC GAG AGA GTT ATC GGA GCT GGT GAA TTT GGA GAA GTC TGC AGT GGA 1198 
He Glu Arg Val He Gly Ala Gly Glu Phe Gly Glu Val Cys Ser Gly 
385 390 395 

CGG CTG AAA CTG CAG GGA AAA CGC GAG TTT CCA GTG GCT ATC AAA ACC 1246 
Arg Leu Lys Leu Gin Gly Lys Arg Glu Phe Pro Val Ala He Lys Thr 
400 405 410 415 

CTG AAG GTG GGC TAC ACA GAG AAG CAA AGG CGA GAT TTC CTG GGA GAA 1294 
Leu Lys Val Gly Tyr Thr Glu Lys Gin Arg Arg Asp Phe Leu Gly Glu 
420 425 430 

GCG AGC ATC ATG GGG CAG TTC GAC CAC CCC AAC ATC ATC CAC CTG GAA 1342 
Ala Ser He Met Gly Gin Phe Asp His Pro Asn He He His Leu Glu 
435 440 445 

GGT GTC GTC ACA AAA AGC AAA CCT GTA ATG ATA GTA ACG GAA TAC ATG 1390 
Gly Val Val Thr Lys Ser Lys Pro Val Met He Val Thr Glu Tyr Met 
450 455 460 

GAA AAT GGT TCT CTG GAT ACA TTT TTA AAG AAG AAC GAT GGG CAG TTC 1438 
Glu Asn Gly Ser Leu Asp Thr Phe Leu Lys Lys Asn Asp Gly Gln Phe 
465 470 475 

ACG GTC ATT CAG CTG GTC GGG ATG CTG CGA GGC ATC GCA TCA GGG ATG 1486 
Thr Val He Gin Leu Val Gly Met Leu Arg Gly He Ala Ser Gly Met 
480 485 490 495 

AAG TAC CTG TCT GAC ATG GGT TAC GTA CAC AGA GAC CTC GCT GCC AGG 1534 
Lys Tyr Leu Ser Asp Met Gly Tyr Val His Arg Asp Leu Ala Ala Arg 
500 505 510 

AAT ATC CTC ATC AAC AGC AAC TTA GTC TGC AAG GTG TCT GAC TTT GGC 1582 
Asn He Leu He Asn Ser Asn Leu Val Cys Lys Val Ser Asp Phe Gly 
515 520 525 

CTC TCC AGA GTC CTA GAA GAT GAT CCT GAA GCA GCG TAC ACA ACC AGG 1630 
Leu Ser Arg Val Leu Glu Asp Asp Pro Glu Ala Ala Tyr Thr Thr Arg 
530 535 540 

GGA GGG AAG ATC CCC ATC CGA TGG ACG GCA CCT GAA GCA ATC GCC TTC 1678 
Gly Gly Lys He Pro He Arg Trp Thr Ala Pro Glu Ala He Ala Phe 
545 550 555 

CGC AAA TTC ACG TCG GCC AGC GAT GTG TGG AGC TAC GGC ATT GTG ATG 1726 
Arg Lys Phe Thr Ser Ala Ser Asp Val Trp Ser Tyr Gly He Val Met 
560 565 570 575 

TGG GAA GTG ATG TCC TAT GGC GAG AGA CCT TAC TGG GAA ATG ACA AAC 1774 
Trp Glu Val Met Ser Tyr Gly Glu Arg Pro Tyr Trp Glu Met Thr Asn 
580 585 590 

CAA GAT GTG ATT AAA GCC GTG GAG GAA GGC TAT CGC CTG CCA AGT CCC 1822 
Gin Asp Val He Lys Ala Val Glu Glu Gly Tyr Arg Leu Pro Ser Pro 
595 600 605 

ATG GAC TGC CCT GCT GCT CTC TAC CAG TTG ATG CTT GAC TGC TGG CAG 1870 
Met Asp Cys Pro Ala Ala Leu Tyr Gln Leu Met Leu Asp Cys Trp Gln 
610 615 620 

AAA GAC CGC AAC AGC AGG CCC AAG TTT GAT GAA ATT GTC AGC ATG TTG 1918 
Lys Asp Arg Asn Ser Arg Pro Lys Phe Asp Glu He Val Ser Met Leu 
625 630 635 

GAC AAG CTC ATC CGT AAC CCA AGC AGC TTG AAG ACG TTG GTT AAT GCA 1966 
Asp Lys Leu Ile Arg Asn Pro Ser Ser Leu Lys Thr Leu Val Asn Ala 
640 645 650 655 

TCG AGC AGA GTA TCA AAT TTG TTG GTA GAA CAC AGT CCA GTG GGG AGC 2014 
Ser Ser Arg Val Ser Asn Leu Leu Val Glu His Ser Pro Val Gly Ser 
660 665 670 

GGT GCC TAC AGG TCA GTG GGT GAG TGG CTG GAA GCC ATC AAA ATG GGT 2062 
Gly Ala Tyr Arg Ser Val Gly Glu Trp Leu Glu Ala He Lys Met Gly 
675 680 685 

CGA TAC ACC GAG ATT TTC ATG GAG AAT GGA TAC AGT TCG ATG GAT TCT 2110 
Arg Tyr Thr Glu Ile Phe Met Glu Asn Gly Tyr Ser Ser Met Asp Ser 
690 695 700 

GTG GCT CAG GTG ACC CTA GAG GAT TTG AGG CGG CTG GGA GTG ACA CTT 2158 
Val Ala Gin Val Thr Leu Glu Asp Leu Arg Arg Leu Gly Val Thr Leu 
705 710 715 

GTT GGT CAC CAG AAG AAG ATA ATG AAC AGC CTT CAA GAG ATG AAG GTC 2206 
Val Gly His Gin Lys Lys He Met Asn Ser Leu Gin Glu Met Lys Val 
720 725 730 735 

CAG TTG GTG AAT GGG ATG GTG CCA TTG TAACTCGGTT TTTAAGTCAC 2253 
Gln Leu Val Asn Gly Met Val Pro Leu 
740 

TTCCTCGAGT GGTCGGTCCT GCACTTTGTA TACTAGCTCT GAGATTTATT TTGACTAAAG 2313 

AAGAAAAAAG GGAAATTCAG TGGTTTCTGT AACTGAAGGA CGCTGGCTTC TGCCACAGCA 2373 

TTTATAAAGC AGTGTTTGAC TGAAGTTTTC ATTTTCTTCC TATTTGTGTC CTCATTCTCA 2433 

TGAAGTAAAT GTAACATGCA TGGAACATGG AAATGGATCT ACTGTACATG AGGTTACCCA 2493 

ATTTCTTGCG CTTCAGCATG ACAACAGCAA GCCTTCCCAC CACATGTTGT CTATACATGG 2553 

GAGATATATA TATATGCATA TATATATATA GCACCTTTAT ATACTGAATT ACAGCAGCAG 2613 

CACATGTTAA TACTTCCAAG GACTTACTTG ACTAGAGAAG TTTTGCAGCC ATTGTGGGCT 2673 

CACACAAGCT GCGGTTTACT GAAGTTTACT TCAAGTCTTA CTTGTCTACA GAAGTGTATT 2733 

GAAGAGCAAT ATGATTAGAT TATTTCTGGA TAGATATTTT GTTTTGTAAA TTTAAAAAAT 2793 

CGTGTTACAC AGCGTTAAGT TATAGAGACT AGTGTATAAA CATGTTGCTT GCTCAATGGC 2853 

AAATACAATA CAGGGTGTAT ATTTTTTTCT CTCTGTGTTG CAAAGTTCTT TTAGTTTGCT 2913 

CTTCTGTGAG GATAATACGT TATGATGTAT ATACTGTACA GTTTGCTACA CATCAGGTAC 2973 

AAGATTGGGG CTTTCTCAAT GTTTTGTTCT TTTTCCCTCT TTTGTTTCAT TTTGTCTTCC 3033 

TTTTGTGTTA ACCACTATGC TTTGTATTTT TGCTGCTGTT TGGTTTGAGG CAACATATAA 3093 

AGCTTTCAGG TGTTTTGATT ATAAAAAAAA AG 3125 
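
The relationship between the CDS feature of SEQ ID NO: 19 (positions 2 to 2233) and the translation printed beneath each codon row can be checked with a short script. The following is only an illustrative sketch and not part of the original listing: it assumes the standard genetic code, and the helper names `codon_count` and `translate` are hypothetical. It translates the first codon row shown above in the reading frame that begins at nucleotide 2, and it confirms that a CDS spanning positions 2 to 2233 holds 744 codons, which agrees with the 744 amino acids given for SEQ ID NO: 20.

```python
# Standard genetic code, with bases ordered T, C, A, G at each codon position.
# Assumption: the listing uses the standard code (no organelle-specific table).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def codon_count(cds_start, cds_end):
    """Number of codons in a CDS given 1-based, inclusive coordinates."""
    length = cds_end - cds_start + 1
    if length % 3 != 0:
        raise ValueError("CDS length is not a multiple of three")
    return length // 3


def translate(dna, cds_start=1):
    """Translate DNA (spaces ignored) from a 1-based start; stop at a stop codon."""
    seq = "".join(dna.split()).upper()
    coding = seq[cds_start - 1:]
    protein = []
    for pos in range(0, len(coding) - 2, 3):
        residue = CODON_TABLE[coding[pos:pos + 3]]
        if residue == "*":
            break
        protein.append(residue)
    return "".join(protein)


# First row of SEQ ID NO: 19 as printed; the lone leading "C" precedes the CDS.
first_row = "C CTC AAA TTC ACC CTG AGG GAC TGT AAC AGC CTT CCA GGA GGA CTT"
print(translate(first_row, cds_start=2))  # one-letter codes for residues 1-15
print(codon_count(2, 2233))               # codons in the CDS of SEQ ID NO: 19
```

The one-letter output corresponds to the three-letter row Leu Lys Phe Thr Leu Arg Asp Cys Asn Ser Leu Pro Gly Gly Leu printed under the first codon row.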

(2) INFORMATION FOR SEQ ID NO:20: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 744 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 20: 

Leu Lys Phe Thr Leu Arg Asp Cys Asn Ser Leu Pro Gly Gly Leu Gly 
1 5 10 15 

Thr Cys Lys Glu Thr Phe Asn Met Tyr Tyr Phe Glu Ser Asp Asp Glu 
20 25 30 

Asp Gly Arg Asn Ile Arg Glu Asn Gln Tyr Ile Lys Ile Asp Thr Ile 
35 40 45 

Ala Ala Asp Glu Ser Phe Thr Glu Leu Asp Leu Gly Asp Arg Val Met 
50 55 60 

Lys Leu Asn Thr Glu Val Arg Asp Val Gly Pro Leu Thr Lys Lys Gly 
65 70 75 80 

Phe Tyr Leu Ala Phe Gln Asp Val Gly Ala Cys Ile Ala Leu Val Ser 
85 90 95 

Val Arg Val Tyr Tyr Lys Lys Cys Pro Ser Val He Arg Asn Leu Ala 
100 105 110 

Arg Phe Pro Asp Thr Ile Thr Gly Ala Asp Ser Ser Gln Leu Leu Glu 
115 120 125 

Val Ser Gly Val Cys Val Asn His Ser Val Thr Asp Glu Ala Pro Lys 
130 135 140 

Met His Cys Ser Ser Glu Gly Glu Trp Leu Val Pro He Gly Lys Cys 
145 150 155 160 

Leu Cys Lys Ala Gly Tyr Glu Glu Lys Asn Asn Thr Cys Gin Ala Pro 
165 170 175 

Ser Pro Val Ser Ser Val Lys Lys Gly Lys He Thr Lys Asn Ser He 
180 185 190 

Ser Leu Ser Trp Gin Glu Pro Asp Arg Pro Asn Gly He He Leu Glu 
195 200 205 

Tyr Glu He Lys Tyr Phe Glu Lys Asp Gin Glu Thr Ser Tyr Thr He 
210 215 220 

He Lys Ser Lys Glu Thr Ala He Thr Ala Asp Gly Leu Lys Pro Gly 
225 230 235 240 

Ser Ala Tyr Val Phe Gin He Arg Ala Arg Thr Ala Ala Gly Tyr Gly 
245 250 255 

Gly Phe Ser Arg Arg Phe Glu Phe Glu Thr Ser Pro Val Leu Ala Ala 
260 265 270 

Ser Ser Asp Gin Ser Gin He Pro He He Val Val Ser Val Thr Val 
275 280 285 

Gly Val He Leu Leu Ala Val Val He Gly Phe Leu Leu Ser Gly Ser 
290 295 300 

Cys Cys Asp His Gly Cys Gly Trp Ala Ser Ser Leu Arg Ala Val Ala 
305 310 315 320 

Tyr Pro Ser Leu He Trp Arg Cys Gly Tyr Ser Lys Ala Lys Gin Asp 
325 330 335 

Pro Glu Glu Glu Lys Met His Phe His Asn Gly His He Lys Leu Pro 
340 345 350 

Gly Val Arg Thr Tyr He Asp Pro His Thr Tyr Glu Asp Pro Asn Gin 
355 360 365 

Ala Val His Glu Phe Ala Lys Glu He Glu Ala Ser Cys He Thr He 
370 375 380 

Glu Arg Val Ile Gly Ala Gly Glu Phe Gly Glu Val Cys Ser Gly Arg 
385 390 395 400 

Leu Lys Leu Gin Gly Lys Arg Glu Phe Pro Val Ala He Lys Thr Leu 
405 410 415 

Lys Val Gly Tyr Thr Glu Lys Gin Arg Arg Asp Phe Leu Gly Glu Ala 
420 425 430 

Ser Ile Met Gly Gln Phe Asp His Pro Asn Ile Ile His Leu Glu Gly 
435 440 445 

Val Val Thr Lys Ser Lys Pro Val Met Ile Val Thr Glu Tyr Met Glu 
450 455 460 

Asn Gly Ser Leu Asp Thr Phe Leu Lys Lys Asn Asp Gly Gln Phe Thr 
465 470 475 480 

Val Ile Gln Leu Val Gly Met Leu Arg Gly Ile Ala Ser Gly Met Lys 
485 490 495 

Tyr Leu Ser Asp Met Gly Tyr Val His Arg Asp Leu Ala Ala Arg Asn 
500 505 510 

He Leu He Asn Ser Asn Leu Val Cys Lys Val Ser Asp Phe Gly Leu 
515 520 525 

Ser Arg Val Leu Glu Asp Asp Pro Glu Ala Ala Tyr Thr Thr Arg Gly 
530 535 540 

Gly Lys Ile Pro Ile Arg Trp Thr Ala Pro Glu Ala Ile Ala Phe Arg 
545 550 555 560 

Lys Phe Thr Ser Ala Ser Asp Val Trp Ser Tyr Gly He Val Met Trp 
565 570 575 

Glu Val Met Ser Tyr Gly Glu Arg Pro Tyr Trp Glu Met Thr Asn Gin 
580 585 590 

Asp Val He Lys Ala Val Glu Glu Gly Tyr Arg Leu Pro Ser Pro Met 
595 600 605 

Asp Cys Pro Ala Ala Leu Tyr Gin Leu Met Leu Asp Cys Trp Gin Lys 
610 615 620 

Asp Arg Asn Ser Arg Pro Lys Phe Asp Glu He Val Ser Met Leu Asp 
625 630 635 640 

Lys Leu He Arg Asn Pro Ser Ser Leu Lys Thr Leu Val Asn Ala Ser 
645 650 655 

Ser Arg Val Ser Asn Leu Leu Val Glu His Ser Pro Val Gly Ser Gly 
660 665 670 

Ala Tyr Arg Ser Val Gly Glu Trp Leu Glu Ala He Lys Met Gly Arg 
675 680 685 

Tyr Thr Glu Ile Phe Met Glu Asn Gly Tyr Ser Ser Met Asp Ser Val 
690 695 700 

Ala Gin Val Thr Leu Glu Asp Leu Arg Arg Leu Gly Val Thr Leu Val 
705 710 715 720 

Gly His Gln Lys Lys Ile Met Asn Ser Leu Gln Glu Met Lys Val Gln 
725 730 735 

Leu Val Asn Gly Met Val Pro Leu 
740 

(2) INFORMATION FOR SEQ ID NO: 21: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 3056 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: both 

(D) TOPOLOGY: linear 

(ix) FEATURE: 

(A) NAME/KEY: CDS 

(B) LOCATION: 2..2131 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 21: 

C CTC AAA TTC ACC CTG AGG GAC TGT AAC AGC CTT CCA GGA GGA CTT 46 
Leu Lys Phe Thr Leu Arg Asp Cys Asn Ser Leu Pro Gly Gly Leu 
1 5 10 15 

GGG ACT TGC AAG GAG ACT TTT AAC ATG TAC TAC TTT GAG TCA GAT GAT 94 
Gly Thr Cys Lys Glu Thr Phe Asn Met Tyr Tyr Phe Glu Ser Asp Asp 
20 25 30 

GAA GAT GGG AGG AAC ATC AGA GAG AAT CAG TAC ATC AAG ATA GAT ACC 142 
Glu Asp Gly Arg Asn Ile Arg Glu Asn Gln Tyr Ile Lys Ile Asp Thr 
35 40 45 

ATT GCT GCT GAT GAG AGC TTC ACG GAG TTG GAC CTC GGC GAC AGA GTT 190 
He Ala Ala Asp Glu Ser Phe Thr Glu Leu Asp Leu Gly Asp Arg Val 
50 55 60 

ATG AAG TTA AAC ACA GAA GTG AGA GAT GTT GGG CCT CTA ACA AAA AAA 238 
Met Lys Leu Asn Thr Glu Val Arg Asp Val Gly Pro Leu Thr Lys Lys 
65 70 75 

GGA TTT TAC CTT GCT TTC CAG GAT GTG GGC GCC TGC ATT GCC CTG GTC 286 
Gly Phe Tyr Leu Ala Phe Gin Asp Val Gly Ala Cys He Ala Leu Val 
80 85 90 95 

TCT GTG CGT GTG TAC TAC AAG AAA TGC CCA TCA GTG ATC CGC AAC CTG 334 
Ser Val Arg Val Tyr Tyr Lys Lys Cys Pro Ser Val He Arg Asn Leu 
100 105 110 

GCA CGC TTT CCA GAT ACC ATC ACA GGA GCA GAT TCC TCG CAG CTG CTA 382 
Ala Arg Phe Pro Asp Thr He Thr Gly Ala Asp Ser Ser Gin Leu Leu 
115 120 125 

GAA GTG TCA GGC GTC TGT GTC AAC CAC TCA GTG ACT GAT GAG GCA CCA 430 
Glu Val Ser Gly Val Cys Val Asn His Ser Val Thr Asp Glu Ala Pro 
130 135 140 

AAG ATG CAC TGC AGT TCA GAG GGA GAA TGG CTG GTG CCC ATT GGG AAG 478 
Lys Met His Cys Ser Ser Glu Gly Glu Trp Leu Val Pro He Gly Lys 
145 150 155 

TGT TTG TGC AAG GCA GGG TAC GAG GAG AAG AAC AAC ACC TGC CAA GCA 526 
Cys Leu Cys Lys Ala Gly Tyr Glu Glu Lys Asn Asn Thr Cys Gin Ala 
160 165 170 175 

CCT TCT CCA GTC AGT AGT GTG AAA AAA GGG AAG ATA ACT AAA AAT AGC 574 
Pro Ser Pro Val Ser Ser Val Lys Lys Gly Lys He Thr Lys Asn Ser 
180 185 190 

ATC TCC CTT TCC TGG CAG GAG CCA GAT CGA CCC AAC GGC ATC ATC CTG 622 
He Ser Leu Ser Trp Gin Glu Pro Asp Arg Pro Asn Gly He He Leu 
195 200 205 

GAA TAC GAA ATC AAA TAT TTT GAA AAG GAC CAG GAG ACA AGC TAC ACC 670 
Glu Tyr Glu He Lys Tyr Phe Glu Lys Asp Gin Glu Thr Ser Tyr Thr 
210 215 220 



ATC ATC AAA TCC AAA GAG ACC GCA ATT ACG GCA GAT GGC TTG AAA CCA 718 
Ile Ile Lys Ser Lys Glu Thr Ala Ile Thr Ala Asp Gly Leu Lys Pro 
225 230 235 

GGC TCA GCG TAC GTC TTC CAG ATC CGA GCC CGG ACA GCT GCT GGC TAC 766 
Gly Ser Ala Tyr Val Phe Gln Ile Arg Ala Arg Thr Ala Ala Gly Tyr 
240 245 250 255 

GGT GGC TTC AGT CGA AGA TTT GAG TTT GAA ACC AGC CCA GTG TTA GCT 814 
Gly Gly Phe Ser Arg Arg Phe Glu Phe Glu Thr Ser Pro Val Leu Ala 
260 265 270 

GCA TCC AGT GAC CAG AGC CAG ATT CCT ATA ATT GTT GTG TCT GTA ACA 862 
Ala Ser Ser Asp Gin Ser Gin He Pro He He Val Val Ser Val Thr 
275 280 285 

GTG GGA GTT ATT CTG CTG GCT GTT GTT ATC GGT TTC CTT CTC AGT GGA 910 
Val Gly Val Ile Leu Leu Ala Val Val Ile Gly Phe Leu Leu Ser Gly 
290 295 300 

AGG CGC TGT GGC TAC AGC AAG GCT AAA CAA GAC CCA GAA GAA GAA AAG 958 
Arg Arg Cys Gly Tyr Ser Lys Ala Lys Gin Asp Pro Glu Glu Glu Lys 
305 310 315 

ATG CAT TTT CAT AAT GGC CAC ATT AAA CTG CCT GGT GTA AGA ACC TAC 1006 
Met His Phe His Asn Gly His He Lys Leu Pro Gly Val Arg Thr Tyr 
320 325 330 335 

ATT GAT CCC CAC ACC TAT GAG GAC CCT AAT CAA GCT GTC CAC GAG TTT 1054 
He Asp Pro His Thr Tyr Glu Asp Pro Asn Gin Ala Val His Glu Phe 
340 345 350 

GCC AAG GAA ATA GAA GCT TCG TGC ATA ACC ATC GAG AGA GTT ATC GGA 1102 
Ala Lys Glu He Glu Ala Ser Cys He Thr He Glu Arg Val He Gly 
355 360 365 

GCT GGT GAA TTT GGA GAA GTC TGC AGT GGA CGG CTG AAA CTG CAG GGA 1150 
Ala Gly Glu Phe Gly Glu Val Cys Ser Gly Arg Leu Lys Leu Gin Gly 
370 375 380 

AAA CGC GAG TTT CCA GTG GCT ATC AAA ACC CTG AAG GTG GGC TAC ACA 1198 
Lys Arg Glu Phe Pro Val Ala He Lys Thr Leu Lys Val Gly Tyr Thr 
385 390 395 

GAG AAG CAA AGG CGA GAT TTC CTG GGA GAA GCG AGC ATC ATG GGG CAG 1246 
Glu Lys Gin Arg Arg Asp Phe Leu Gly Glu Ala Ser He Met Gly Gin 
400 405 410 415 

TTC GAC CAC CCC AAC ATC ATC CAC CTG GAA GGT GTC GTC ACA AAA AGC 1294 
Phe Asp His Pro Asn He He His Leu Glu Gly Val Val Thr Lys Ser 
420 425 430 

AAA CCT GTA ATG ATA GTA ACG GAA TAC ATG GAA AAT GGT TCT CTG GAT 1342 
Lys Pro Val Met He Val Thr Glu Tyr Met Glu Asn Gly Ser Leu Asp 
435 440 445 

ACA TTT TTA AAG AAG AAC GAT GGG CAG TTC ACG GTC ATT CAG CTG GTC 1390 
Thr Phe Leu Lys Lys Asn Asp Gly Gin Phe Thr Val He Gin Leu Val 
450 455 460 

GGG ATG CTG CGA GGC ATC GCA TCA GGG ATG AAG TAC CTG TCT GAC ATG 1438 
Gly Met Leu Arg Gly Ile Ala Ser Gly Met Lys Tyr Leu Ser Asp Met 
465 470 475 

GGT TAC GTA CAC AGA GAC CTC GCT GCC AGG AAT ATC CTC ATC AAC AGC 1486 
Gly Tyr Val His Arg Asp Leu Ala Ala Arg Asn Ile Leu Ile Asn Ser 
480 485 490 495 

AAC TTA GTC TGC AAG GTG TCT GAC TTT GGC CTC TCC AGA GTC CTA GAA 1534 
Asn Leu Val Cys Lys Val Ser Asp Phe Gly Leu Ser Arg Val Leu Glu 
500 505 510 

GAT GAT CCT GAA GCA GCG TAC ACA ACC AGG GGA GGG AAG ATC CCC ATC 1582 
Asp Asp Pro Glu Ala Ala Tyr Thr Thr Arg Gly Gly Lys Ile Pro Ile 
515 520 525 

CGA TGG ACG GCA CCT GAA GCA ATC GCC TTC CGC AAA TTC ACG TCG GCC 1630 
Arg Trp Thr Ala Pro Glu Ala He Ala Phe Arg Lys Phe Thr Ser Ala 
530 535 540 

AGC GAT GTG TGG AGC TAC GGC ATT GTG ATG TGG GAA GTG ATG TCC TAT 1678 
Ser Asp Val Trp Ser Tyr Gly He Val Met Trp Glu Val Met Ser Tyr 
545 550 555 

GGC GAG AGA CCT TAC TGG GAA ATG ACA AAC CAA GAT GTG ATT AAA GCC 1726 
Gly Glu Arg Pro Tyr Trp Glu Met Thr Asn Gin Asp Val He Lys Ala 
560 565 570 575 

GTG GAG GAA GGC TAT CGC CTG CCA AGT CCC ATG GAC TGC CCT GCT GCT 1774 
Val Glu Glu Gly Tyr Arg Leu Pro Ser Pro Met Asp Cys Pro Ala Ala 
580 585 590 

CTC TAC CAG TTG ATG CTT GAC TGC TGG CAG AAA GAC CGC AAC AGC AGG 1822 
Leu Tyr Gln Leu Met Leu Asp Cys Trp Gln Lys Asp Arg Asn Ser Arg 
595 600 605 

CCC AAG TTT GAT GAA ATT GTC AGC ATG TTG GAC AAG CTC ATC CGT AAC 1870 
Pro Lys Phe Asp Glu Ile Val Ser Met Leu Asp Lys Leu Ile Arg Asn 
610 615 620 

CCA AGC AGC TTG AAG ACG TTG GTT AAT GCA TCG AGC AGA GTA TCA AAT 1918 
Pro Ser Ser Leu Lys Thr Leu Val Asn Ala Ser Ser Arg Val Ser Asn 
625 630 635 

TTG TTG GTA GAA CAC AGT CCA GTG GGG AGC GGT GCC TAC AGG TCA GTG 1966 
Leu Leu Val Glu His Ser Pro Val Gly Ser Gly Ala Tyr Arg Ser Val 
640 645 650 655 

GGT GAG TGG CTG GAA GCC ATC AAA ATG GGT CGA TAC ACC GAG ATT TTC 2014 
Gly Glu Trp Leu Glu Ala Ile Lys Met Gly Arg Tyr Thr Glu Ile Phe 
660 665 670 

ATG GAG AAT GGA TAC AGT TCG ATG GAT TCT GTG GCT CAG GTG ACC CTA 2062 
Met Glu Asn Gly Tyr Ser Ser Met Asp Ser Val Ala Gln Val Thr Leu 
675 680 685 

GAG GAC GAA TCA CCT TGT GAA AAG TGG AGC CTC ACC CTC CAC CCC CTC 2110 
Glu Asp Glu Ser Pro Cys Glu Lys Trp Ser Leu Thr Leu His Pro Leu 
690 695 700 

TTT CCA ACT GGA TAT CAG ACT TGAAGGAAAC CTTTCCAGTG GACCAGACCT 2161 
Phe Pro Thr Gly Tyr Gln Thr 
705 710 

GCTCTTTAAA CTTGTGGACC ACCTAGTGAC TTTGAGTGTG TCTGGAGCTC TTTCAATCCA 2221 

CTGCAAGAAT AACTTTACCA GGACAGTACT CAAGAATAGA TAGATCCATG ACATGAGTTT 2281 

CAGTCTGATA TTTGACTGGA CCAATTACTA ACAAAATGTG GACTGCATAC TTACACCTTT 2341 

TGAAAGATCT GTACTCACCG AATCTCAGGA CACCCTGTTG TTTGTTATTA GATGAAGAAC 2401 

TCTGAATATT TGTAATAATA TGTGATGTGT TGCTTTGCAT TGTATTTTTT TCTTATAAAA 2461 

TAAAATAAAT TATTTATTAA AAGTTATACT GGGATGAAGA CCATTTAAGA GTTCACCTGC 2521 

TCTAGATGCT TATTCTTAAC CTGAAACCTC AGTTCCGGAT AGTGATACTG CACACGCTTG 2581 

TGAACAAACC CATTCTCGTG TCATAACCAA ACAGGATGGG AGTAATGAAT AAGAGCAGAT 2641 
GAACTCTTAA AAGAAAGATC CTAATCTCAT GCAAAGGTCC CTTGCAAGTG GATTCCTCTC 2701 

TCCCTAGCGT CTTCTAAAGG TCTTTGAGGT TATTCTTTCC CCTCTTTCAA ACTGACAGCT 2761 

AACTCTGTGA GTAGTGTCAG TCTGCATGGG CCAGTGTAGA ACTGCACCAT GTTGAAGAAG 2821 

AGTGCTGCAA TATGGCTGGG GTGGGAGATG AAATGCAAAG TAATCTCTGG TAGGCTGATG 2881 

GCTTCCAGCC ATGGAGGTAT TTCAGGAACC TGGCCCTTTT GCTTGCATGA GTAATGAATG 2941 

GAGTGGTGAG GAGTGTTGTA TTTTATGTGG CAATCCAGTC CTAGTCTACA CTGTGTTTGA 3001 

CAAATTGGTC CATGGTGTAT AAGTAGTTCT ATTTGTAAAT AAAATGTTTT AAATG 3056 
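
As an illustration of how the paired rows of SEQ ID NO: 21 relate each codon triplet to the residue printed beneath it, the following minimal Python sketch translates codons using the standard genetic code. The function name `translate` and the way the codon table is built are assumptions made for this example and are not part of the sequence listing. The input is the final coding row of SEQ ID NO: 21, that is, the last seven codons followed by the TGA stop codon.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, one letter per codon in TCAG order ('*' marks a stop).
AMINO_1 = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
ONE_TO_THREE = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys",
    "Q": "Gln", "E": "Glu", "G": "Gly", "H": "His", "I": "Ile",
    "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
    "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val",
}
# Map each of the 64 codons to its one-letter residue code.
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_1)}


def translate(dna: str) -> list[str]:
    """Translate a DNA string (whitespace ignored) into three-letter residues.

    Translation starts at the first base and stops at the first stop codon.
    """
    seq = "".join(dna.split()).upper()
    residues = []
    for i in range(0, len(seq) - 2, 3):
        aa = CODON_TABLE[seq[i:i + 3]]
        if aa == "*":
            break
        residues.append(ONE_TO_THREE[aa])
    return residues


# Last coding row of SEQ ID NO: 21, including the TGA stop codon.
print(" ".join(translate("TTT CCA ACT GGA TAT CAG ACT TGA")))
# prints: Phe Pro Thr Gly Tyr Gln Thr
```

The result agrees with the last seven residues listed for SEQ ID NO: 22. The sketch assumes the reading frame begins at the first base supplied.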

(2) INFORMATION FOR SEQ ID NO: 22: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 710 amino acids 
(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 22: 

Leu Lys Phe Thr Leu Arg Asp Cys Asn Ser Leu Pro Gly Gly Leu Gly 
1 5 10 15 

Thr Cys Lys Glu Thr Phe Asn Met Tyr Tyr Phe Glu Ser Asp Asp Glu 
20 25 30 

Asp Gly Arg Asn Ile Arg Glu Asn Gln Tyr Ile Lys Ile Asp Thr Ile 
35 40 45 

Ala Ala Asp Glu Ser Phe Thr Glu Leu Asp Leu Gly Asp Arg Val Met 
50 55 60 

Lys Leu Asn Thr Glu Val Arg Asp Val Gly Pro Leu Thr Lys Lys Gly 
65 70 75 80 

Phe Tyr Leu Ala Phe Gln Asp Val Gly Ala Cys Ile Ala Leu Val Ser 
85 90 95 

Val Arg Val Tyr Tyr Lys Lys Cys Pro Ser Val Ile Arg Asn Leu Ala 
100 105 110 

Arg Phe Pro Asp Thr Ile Thr Gly Ala Asp Ser Ser Gln Leu Leu Glu 
115 120 125 

Val Ser Gly Val Cys Val Asn His Ser Val Thr Asp Glu Ala Pro Lys 
130 135 140 

Met His Cys Ser Ser Glu Gly Glu Trp Leu Val Pro Ile Gly Lys Cys 
145 150 155 160 

Leu Cys Lys Ala Gly Tyr Glu Glu Lys Asn Asn Thr Cys Gln Ala Pro 
165 170 175 

Ser Pro Val Ser Ser Val Lys Lys Gly Lys Ile Thr Lys Asn Ser Ile 
180 185 190 

Ser Leu Ser Trp Gln Glu Pro Asp Arg Pro Asn Gly Ile Ile Leu Glu 
195 200 205 

Tyr Glu Ile Lys Tyr Phe Glu Lys Asp Gln Glu Thr Ser Tyr Thr Ile 
210 215 220 
Ile Lys Ser Lys Glu Thr Ala Ile Thr Ala Asp Gly Leu Lys Pro Gly 
225 230 235 240 

Ser Ala Tyr Val Phe Gln Ile Arg Ala Arg Thr Ala Ala Gly Tyr Gly 
245 250 255 

Gly Phe Ser Arg Arg Phe Glu Phe Glu Thr Ser Pro Val Leu Ala Ala 
260 265 270 

Ser Ser Asp Gln Ser Gln Ile Pro Ile Ile Val Val Ser Val Thr Val 
275 280 285 

Gly Val Ile Leu Leu Ala Val Val Ile Gly Phe Leu Leu Ser Gly Arg 
290 295 300 

Arg Cys Gly Tyr Ser Lys Ala Lys Gln Asp Pro Glu Glu Glu Lys Met 
305 310 315 320 

His Phe His Asn Gly His Ile Lys Leu Pro Gly Val Arg Thr Tyr Ile 
325 330 335 

Asp Pro His Thr Tyr Glu Asp Pro Asn Gln Ala Val His Glu Phe Ala 
340 345 350 

Lys Glu Ile Glu Ala Ser Cys Ile Thr Ile Glu Arg Val Ile Gly Ala 
355 360 365 

Gly Glu Phe Gly Glu Val Cys Ser Gly Arg Leu Lys Leu Gln Gly Lys 
370 375 380 

Arg Glu Phe Pro Val Ala Ile Lys Thr Leu Lys Val Gly Tyr Thr Glu 
385 390 395 400 

Lys Gln Arg Arg Asp Phe Leu Gly Glu Ala Ser Ile Met Gly Gln Phe 
405 410 415 

Asp His Pro Asn Ile Ile His Leu Glu Gly Val Val Thr Lys Ser Lys 
420 425 430 

Pro Val Met Ile Val Thr Glu Tyr Met Glu Asn Gly Ser Leu Asp Thr 
435 440 445 

Phe Leu Lys Lys Asn Asp Gly Gln Phe Thr Val Ile Gln Leu Val Gly 
450 455 460 

Met Leu Arg Gly Ile Ala Ser Gly Met Lys Tyr Leu Ser Asp Met Gly 
465 470 475 480 

Tyr Val His Arg Asp Leu Ala Ala Arg Asn Ile Leu Ile Asn Ser Asn 
485 490 495 

Leu Val Cys Lys Val Ser Asp Phe Gly Leu Ser Arg Val Leu Glu Asp 
500 505 510 

Asp Pro Glu Ala Ala Tyr Thr Thr Arg Gly Gly Lys Ile Pro Ile Arg 
515 520 525 

Trp Thr Ala Pro Glu Ala Ile Ala Phe Arg Lys Phe Thr Ser Ala Ser 
530 535 540 

Asp Val Trp Ser Tyr Gly Ile Val Met Trp Glu Val Met Ser Tyr Gly 
545 550 555 560 

Glu Arg Pro Tyr Trp Glu Met Thr Asn Gln Asp Val Ile Lys Ala Val 
565 570 575 

Glu Glu Gly Tyr Arg Leu Pro Ser Pro Met Asp Cys Pro Ala Ala Leu 
580 585 590 

Tyr Gln Leu Met Leu Asp Cys Trp Gln Lys Asp Arg Asn Ser Arg Pro 
595 600 605 

Lys Phe Asp Glu Ile Val Ser Met Leu Asp Lys Leu Ile Arg Asn Pro 
610 615 620 

Ser Ser Leu Lys Thr Leu Val Asn Ala Ser Ser Arg Val Ser Asn Leu 
625 630 635 640 

Leu Val Glu His Ser Pro Val Gly Ser Gly Ala Tyr Arg Ser Val Gly 
645 650 655 

Glu Trp Leu Glu Ala Ile Lys Met Gly Arg Tyr Thr Glu Ile Phe Met 
660 665 670 

Glu Asn Gly Tyr Ser Ser Met Asp Ser Val Ala Gln Val Thr Leu Glu 
675 680 685 

Asp Glu Ser Pro Cys Glu Lys Trp Ser Leu Thr Leu His Pro Leu Phe 
690 695 700 

Pro Thr Gly Tyr Gln Thr 
705 710 

(2) INFORMATION FOR SEQ ID NO:23: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 19 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:23: 

Arg Ile Cys Thr Pro Asp Val Ser Gly Thr Val Gly Ser Arg Pro Ala 
1 5 10 15 

Ala Asp His 

(2) INFORMATION FOR SEQ ID NO: 24: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 13 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 24: 

Cys Leu Glu Thr His Thr Lys Asn Ser Pro Val Pro Val 
1 5 10 




(2) INFORMATION FOR SEQ ID NO:25: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 12 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 25: 

Lys Met Gln Gln Met His Gly Arg Met Val Pro Val 
1 5 10 

(2) INFORMATION FOR SEQ ID NO: 26: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 12 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:26: 

Lys Val His Leu Asn Gln Leu Glu Pro Val Glu Val 
1 5 10 

What is claimed is: 

1. A composition of matter, comprising an 
isolated nucleic acid sequence encoding a Eph-related 
protein tyrosine kinase, or functional fragment thereof, 
having about 23 to 66 percent amino acid sequence identity 
in its carboxyl terminal variable region compared to known 
members of the Eph subclass of tyrosine kinases. 

2. The composition of claim 1, comprising 
substantially the same nucleotide sequence selected from 
the group consisting of SEQ ID NOS: 3, 7, 9, 11, 13, 19 and 
21. 

3. A composition of matter, comprising a vector 
containing the nucleic acid of claim 1. 

4. The composition of claim 3, wherein said 
vector is for the expression of a recombinant Eph-related 
protein tyrosine kinase. 

5. The composition of claim 4, wherein said 
expression is in a procaryotic host. 

6. The composition of claim 4, wherein said 
expression is in a eucaryotic host. 

7. A composition of matter, comprising a host 
cell containing the vector of claim 3. 

8. The composition of claim 7, wherein said 
host cell is procaryotic. 

9. The composition of claim 7, wherein said 
host cell is eucaryotic. 

10. A composition of matter, comprising a 
substantially purified Eph-related protein tyrosine kinase, 
or functional fragment thereof, having about 23 to 66 
percent amino acid sequence identity in its carboxyl 
terminal variable region compared to known members of the 
Eph subclass of tyrosine kinases. 

11. The composition of claim 10, comprising 
substantially the same amino acid sequence selected from 
the group consisting of SEQ ID NOS: 4, 8, 10, 12, 14, 20 
and 22. 

12. A composition of matter, comprising a 
substantially purified chicken Eph-related protein tyrosine 
kinase, or functional fragment thereof having substantially 
the same amino acid sequence of SEQ ID NO: 2. 

13. A composition of matter, comprising a 
substantially purified chicken Eph-related protein tyrosine 
kinase, or functional fragment thereof having substantially 
the same amino acid sequence of SEQ ID NO: 6. 

14. A method of diagnosing cancer, comprising 
removing a tissue or cell sample from a subject suspected 
of having cancer and determining the level of Eph-related 
protein tyrosine kinase in said sample, wherein a change in 
the level or activity of a Eph-related protein tyrosine 
kinase compared to a normal sample indicates the presence 
of a cancer or correlates with a specific prognosis. 

15. The method of claim 13, wherein an increase 
in said change in the level or activity of a Eph-related 
protein tyrosine kinase indicates the presence of a cancer 
or correlates with a specific prognosis. 

16. The method of claim 13, wherein a decrease 
in said change in the level or activity of a Eph-related 
protein tyrosine kinase indicates the presence of a cancer 
or correlates with a specific prognosis. 

17. The method of claim 12, wherein said cancer 
is selected from the group consisting of liver carcinoma, 
lung carcinoma, breast carcinoma, colon carcinoma and 
leukemia. 
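
Claims 1 and 10 recite a percent amino acid sequence identity between regions of two kinases. Purely as an illustration, the following Python sketch computes percent identity for two sequences that have already been aligned. The function name `percent_identity`, the use of `-` as the gap character, the choice of denominator (all alignment columns that are not gaps in both sequences) and the toy sequences are assumptions made for this example; the claims do not define the calculation in this way, and the alignment method described in the specification governs.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent of alignment columns holding the same residue in both sequences.

    Assumption: both inputs come from the same pairwise alignment, so they
    have equal length, with `gap` marking inserted positions.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    # Ignore columns where both sequences carry a gap.
    columns = [(x, y) for x, y in zip(aligned_a, aligned_b)
               if not (x == gap and y == gap)]
    if not columns:
        raise ValueError("no alignment columns to compare")
    # A column counts as identical only when both residues match and are not gaps.
    matches = sum(1 for x, y in columns if x == y and x != gap)
    return 100.0 * matches / len(columns)


# Hypothetical toy alignment in one-letter code (not taken from the sequence listing).
print(percent_identity("LKFT-RDC", "LKYTLRDC"))  # 6 of 8 columns match: 75.0
```

Other conventions, such as dividing by the length of the shorter sequence, give different values for the same alignment, so the convention should be stated whenever a figure is reported.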